3 September 2020
Chris Hedenberg

How We Use AI to Better Understand Tech E&O Risk (Part II of II)

A story of how AI can be deployed to benefit carriers, brokers and policyholders.

In our first post, we covered the reasons why we set out to analyze litigation patterns and described how we used Natural Language Processing (NLP), an AI technique, to solve a sticky issue with the way companies are named. If you missed it, catch up on Part I first.

Now, we’re going to look at how we handled another challenge in building our database of litigation, and consider what it means for insurance companies to be taking this kind of approach.

Oh, and of course, we’ll review whether this all worked!

What is a tech company, anyway?

After solving how to deal with variance in company names, we turned our attention to enriching the database with information to enable segmentation. Most importantly, we wanted to classify both plaintiff and defendant companies by industry. Since this is Technology E&O, we didn’t want to make assumptions about litigation risk based on defendants from other industries, which may be more or less likely than technology companies to be sued.

Of course, industry class is readily available from a number of data sources. But we’ve found that tech companies in particular are often misclassified. A provider of electronic medical records (EMR) software for hospitals may be (not incorrectly) called a “healthcare” company, or something ill-defined like “e-health”. We’d want to differentiate the EMR company both from a healthcare provider and from another tech company like Fitbit that has an entirely different business model. An additional wrinkle is that many of the ways to categorize companies are not in sync with the NAICS codes used across the insurance industry, which we use as the basis for our own categorizations.

We thought we could better classify companies using information from the source that knows them best: the companies themselves.

A typical company’s website contains thousands of words describing products and services. So we again turned to NLP, this time specifically to BERT, the technique used by Google to analyze its searches. NLP models like BERT can ingest a large volume of written text and interpret it much as a human being would, as long as they have been adequately trained. We fed the model websites of tech companies so that it could learn what kinds of language were present on them, then let it loose on our own database.

Having learned what tech companies typically have on their websites, our NLP model was able to re-classify thousands of companies into the preferred buckets, based on how they describe their own products and services.
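To make the idea concrete, here is a minimal sketch of classifying a company by the language on its website. It is not our production model: it swaps BERT for a much simpler bag-of-words, nearest-centroid classifier so that it runs with only the Python standard library. The labels, training snippets, and function names are all hypothetical and chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def centroid(texts):
    """Build a unit-length term-frequency vector from a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {word: v / norm for word, v in counts.items()}


def cosine(vec, text):
    """Cosine similarity between a unit-length centroid and a raw text."""
    counts = Counter(tokenize(text))
    norm = math.sqrt(sum(v * v for v in counts.values()))
    if norm == 0:
        return 0.0
    return sum(vec.get(w, 0.0) * v for w, v in counts.items()) / norm


def classify(text, centroids):
    """Return the label whose centroid is most similar to the text."""
    return max(centroids, key=lambda label: cosine(centroids[label], text))


# Hypothetical, hand-written training snippets (not real website text).
TRAINING = {
    "software": [
        "cloud platform for electronic medical records software and API integration",
        "SaaS dashboard, software deployment and developer tools",
    ],
    "healthcare_provider": [
        "our clinic physicians treat patients with primary care appointments",
        "hospital services, nurses, surgery and patient visits",
    ],
}
CENTROIDS = {label: centroid(texts) for label, texts in TRAINING.items()}

label = classify("EMR software platform with API for hospitals", CENTROIDS)
```

In this toy setup the EMR vendor lands in the "software" bucket rather than "healthcare_provider", because its own description talks about software, platforms and APIs. A model like BERT reaches the same kind of decision with far richer language understanding and far more training text.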

The Payoff

By now, we had a rich database of legal actions going back many years with the correct plaintiffs and defendants identified and classified, as well as the lawyers and judges involved. You may be asking: after all that, did it actually deliver anything? 

Thankfully, yes!

The database has provided the foundation for a number of uses, most notably to create scoring mechanisms that feed into our proprietary underwriting model. We can now score risk from both the “defendant” side (how risky is tech company “X”, regardless of who its customers are) and from the “plaintiff” side (how litigious is customer “Y,” and does that impact X’s risk). The model’s scores align closely with real-world results, as shown in the chart.
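As a rough illustration of scoring from both sides, the sketch below counts how often a vendor has been sued (the defendant side) and how often its customers have filed suits (the plaintiff side), then blends the two. The records, the company names, the `customer_weight` parameter and the additive formula are all assumptions made for this example; the actual underwriting model is proprietary and more involved.

```python
from collections import Counter

# Hypothetical records: one (plaintiff, defendant) pair per lawsuit.
LAWSUITS = [
    ("Acme Manufacturing", "VendorSoft"),
    ("Acme Manufacturing", "CloudCo"),
    ("Beta Retail", "VendorSoft"),
    ("Acme Manufacturing", "DataWorks"),
]


def side_counts(lawsuits):
    """Count suits filed per plaintiff and suits received per defendant."""
    filed = Counter(plaintiff for plaintiff, _ in lawsuits)
    sued = Counter(defendant for _, defendant in lawsuits)
    return filed, sued


def combined_score(vendor, customers, lawsuits, customer_weight=0.5):
    """Blend defendant-side and plaintiff-side counts into one number.

    defendant side: how often the vendor itself has been sued.
    plaintiff side: average number of suits filed by the vendor's customers.
    """
    filed, sued = side_counts(lawsuits)
    defendant_side = sued[vendor]
    if customers:
        plaintiff_side = sum(filed[c] for c in customers) / len(customers)
    else:
        plaintiff_side = 0.0
    return defendant_side + customer_weight * plaintiff_side


score = combined_score("VendorSoft", ["Acme Manufacturing"], LAWSUITS)
```

Here VendorSoft has been sued twice and its one listed customer has filed three suits, so the blended score is 2 + 0.5 × 3 = 3.5. Raw counts are only a starting point; a real score would need to account for company size, time period and other factors.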

With these scores we can now add litigation risk to the many other rating factors involved in Tech E&O underwriting, for a more well-rounded view of risk. We’re continuing to refine the model and are excited to deploy it in more sophisticated ways. New inputs, like data on settlement amounts, will yield more detailed outputs. We’re also learning a lot about litigation patterns across different industries, company sizes, and other slices of the database, enabling more potential scoring and underwriting measures.

In fact, these scores were key to the development of automated underwriting for Tech E&O, which Corvus launched to our broker partners earlier this year. We’ve started automating quotes for lower-risk applications and are working to increase the complexity of risks we can automate.

What does this tell us about the future of insurance?

We felt this story was worth telling not because its ultimate findings were earth-shattering, but because of what it represents: the future of insurance.

Traditional insurers have a few well-worn tools for determining pricing and underwriting rules. Primarily, they have their own claims information, the filings of competitors, and market feedback (whether the price and coverage they quote is competitive). These tools have worked well enough for many lines of insurance for centuries. But in a world where nearly all business activity leaves a digital fingerprint that can be analyzed, the traditional model looks more and more outdated.

Projects like this one represent how insurers can go about expanding the amount of data that feeds into underwriting and proactively improving their risk assessments. They enable us to go broader in scope – by factoring in an entirely new set of risk data – and simultaneously get more granular, by enabling us to drill down to highly specific sub-sets of industry and company size.

Using an example from this project, we can say that a company that initiated at least one lawsuit in the last two years, is in the manufacturing sector, and has over 250 employees presents one of the highest risks of litigation for its IT or software vendors. Even with a highly developed claims database, a traditional underwriter would struggle to be so specific.
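The segment described above can be written down as a simple rule. The sketch below is only an illustration of that one example: the `Customer` fields, the function name and the approximation of two years as 730 days are assumptions, not part of our actual model.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Customer:
    """Hypothetical record for a vendor's customer."""
    name: str
    sector: str
    employees: int
    lawsuits_filed: list = field(default_factory=list)  # dates of suits filed


def is_high_litigation_segment(customer, today):
    """Apply the three conditions from the example in the text."""
    cutoff = today - timedelta(days=2 * 365)  # "last two years", approximated
    filed_recently = any(d >= cutoff for d in customer.lawsuits_filed)
    return (
        filed_recently
        and customer.sector == "manufacturing"
        and customer.employees > 250
    )


today = date(2020, 9, 3)
acme = Customer("Acme Manufacturing", "manufacturing", 400, [date(2019, 6, 1)])
flagged = is_high_litigation_segment(acme, today)
```

The value of the litigation database is that rules like this come out of the data rather than being guessed in advance.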

The introduction of cyber perils into nearly all corners of the P/C risk environment further underscores that change is needed. Claims information is necessarily backward-looking, and does not account, for instance, for rapid changes in the behavior of cyber criminals or the discovery of a major new vulnerability. Just look at the rise of ransomware, and specifically Remote Desktop Protocol (RDP) as a key vector for attack.

Figure: Ransomware Attack Vectors. Source: Coveware, https://www.coveware.com/blog/dont-become-a-ransomware-target-secure-rdp

One could argue that badly secured RDP ports are the most critical factor in determining risk of a ransomware attack. But rates built on years of claims data, and keyed to factors like industry class or revenue size, won’t tell us whether the insured has this particular risk. For that we need up-to-the-moment data. We need to know if the insured is doing a good job securing its RDP ports today.
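As a simple illustration of what up-to-the-moment data can look like, the sketch below checks whether a host accepts TCP connections on port 3389, the default port for RDP. The function name and timeout are assumptions for this example, and it is a crude signal at best: an open port is not the same as a badly secured one, and a check like this should only be run against systems you are authorized to test.

```python
import socket

RDP_PORT = 3389  # default TCP port for Remote Desktop Protocol


def is_port_open(host, port=RDP_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A call such as `is_port_open("203.0.113.10")` (a documentation-only address used here as a placeholder) would report whether the RDP port answers right now, which is exactly the kind of current fact that historical claims data cannot provide.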

Policyholders and Brokers Win

The most direct application of this kind of data is for insurers to make determinations about risk. But it’s far from the only one.

With a little ingenuity, brokers and their policyholders can see improvements in speed and efficiency in the quoting process. We already mentioned that at Corvus the litigation database has enabled our team to automate underwriting for Tech E&O, resulting in quotes delivered in minutes. This is coupled with a faster application process, since we are able to develop a risk profile from fewer initial pieces of information. Faster quote delivery, less repetitive data entry thanks to shorter forms, and deeper integration with APIs across platforms can all be unlocked when more forms of data are available.

In future iterations of this project, we plan to take the further step of informing policyholders about the litigation patterns that may affect them in their particular industry segment. This will include a Dynamic Loss Prevention scorecard that provides information about the most litigious customers in their areas, along with claim severity estimates based on analysis of settlement amounts. This will help organizations better manage risk and find the safest paths to success.

***

This is the future of insurance: bringing in new sources of data to broaden the scope of what’s considered for underwriting, ensuring that data is as recent as possible wherever needed, and then sharing that information with brokers and policyholders to make everyone safer from adverse events. Thanks to AI, we can accomplish all of this at scale. It’s an exciting moment for insurance, and we’re eager to share more of our progress soon.

