A story of how AI can be deployed to benefit carriers, brokers, and policyholders.
We can’t believe spring is already here! We’ve been launching products, opening new offices, and pushing our technology to new heights.
Artificial Intelligence (AI) and insurance is a marriage often discussed on conference panels and in investor pitches, mostly in terms of its potential. Unfortunately for those of us working in Data Science, we see less said about how AI is actually deployed in insurance.
We’re hoping to change that. Today we’re going to share part one of a two-part story on how AI is deployed for insurance. Enjoy!
Tech Companies and Their Risks
Tech E&O (short for Technology Errors and Omissions Insurance) is a line of specialty insurance that covers the legal liability and resulting financial risk that providers of software, IT or professional services face if the products or services they offer fail or are breached, resulting in harm to their customers.
One important factor in Tech E&O is “cyber” risk, such as ransomware or other cyberattacks. Tech providers and services companies are often not as secure as they (or others) think they are, as witnessed by the rise in successful attacks on managed service providers (MSPs). A recent real-world example is Blackbaud, a cloud service provider that experienced a ransomware attack that caused disruptions at many of its customers.
But there is another side of the coin: the risk of clients taking costly legal action against their vendor if an incident occurs. Just as companies vary widely when it comes to cybersecurity posture, they also vary when it comes to litigation risk. Two software companies that experience the same cyberattack may have very different financial outcomes simply because one was sued for significant damages by a large client and the other wasn’t.
To find out whether that litigation risk can be measured, we’ll need to do some data science (and yes, make use of AI techniques).
Data Analysis Creates An Opening
To understand if a project like this was viable, the Corvus Data Science team first analyzed litigation patterns using information from thousands of available legal filings.
The major finding: a remarkably strong year-over-year correlation in how often a company litigates. In other words, a company that initiated lawsuits against five different companies one year went on to initiate lawsuits against nearly the same number of new companies in the following year.
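To make the idea of a year-over-year correlation concrete, here is a minimal sketch in Python. It is an illustration only, not the analysis the team actually ran: the filings, the company names, and the helper names (`distinct_defendant_counts`, `pearson`, `year_over_year_correlation`) are all hypothetical, and Pearson correlation is assumed as the measure because the post does not say which one was used.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical filings as (plaintiff, defendant, year). Illustrative data only.
FILINGS = (
    [("acme", f"target_a{i}", 2019) for i in range(5)]
    + [("acme", f"target_b{i}", 2020) for i in range(4)]
    + [("beta", "target_c0", 2019), ("beta", "target_c1", 2020)]
    + [("gamma", f"target_d{i}", 2019) for i in range(2)]
    + [("gamma", f"target_e{i}", 2020) for i in range(3)]
)


def distinct_defendant_counts(filings, year):
    """Map each plaintiff to the number of distinct companies it sued in `year`."""
    defendants = defaultdict(set)
    for plaintiff, defendant, filed_year in filings:
        if filed_year == year:
            defendants[plaintiff].add(defendant)
    return {plaintiff: len(sued) for plaintiff, sued in defendants.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)


def year_over_year_correlation(filings, year):
    """Correlate each plaintiff's count in `year` with its count in `year + 1`."""
    first = distinct_defendant_counts(filings, year)
    second = distinct_defendant_counts(filings, year + 1)
    # A company with no filings in one of the two years counts as zero.
    companies = sorted(set(first) | set(second))
    return pearson(
        [first.get(c, 0) for c in companies],
        [second.get(c, 0) for c in companies],
    )


if __name__ == "__main__":
    print(round(year_over_year_correlation(FILINGS, 2019), 3))
```

A value close to 1 would mean that companies that sued many others in one year tended to do so again the next year, which is the pattern described above.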
With this promising result, the question then became: Can we effectively analyze any company’s litigiousness, and therefore estimate the risk it presents to a company that has it as a customer?
We thought we could, but knew we’d need a trove of data much richer than the one we'd used initially. In building this enriched database, we quickly encountered some challenges. That's where things got interesting (from a data scientist's perspective).
“A Holding Company By Any Other Name…”
The first challenge we faced was dealing with company names. Looking at a mass of unstructured data, we noticed a huge variety in names of businesses listed in legal documents. There are alternate spellings of names (Johnson & Johnson, J&J, jnj), and complex parent/child relationships (DePuy Synthes > Johnson & Johnson Medical Devices Companies > Johnson & Johnson). In order to build a functional database, we needed to know which companies were actually involved in litigation.
Using machine learning techniques from the field of Natural Language Processing (NLP), we trained a machine to perform web searches for company names and return the resulting domain names (that is, websites, like www.jnj.com). By finding a common website domain, we could group companies that go by many different names under one unique identifier. The machine “learned” as it went to improve the accuracy of its results, and we rounded out the results with some additional data from a third party to solve for parent companies that do not share a web domain with their subsidiaries.
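The grouping step can be pictured with a small Python sketch. This is a simplified illustration under stated assumptions, not the production system: the lookup table `NAME_TO_DOMAIN` stands in for the trained web-search step, `PARENT_DOMAIN` stands in for the third-party parent-company data, the domain `depuy-synthes.example` is a made-up placeholder, and all function names are hypothetical.

```python
from collections import defaultdict

# Assumption: stand-in for the web-search step, which maps a company name as
# written in a filing to a website domain. Entries here are illustrative.
NAME_TO_DOMAIN = {
    "johnson & johnson": "jnj.com",
    "j&j": "jnj.com",
    "jnj": "jnj.com",
    "depuy synthes": "depuy-synthes.example",  # placeholder domain
}

# Assumption: stand-in for third-party data linking a subsidiary's domain to
# its parent's domain when the two do not share a website.
PARENT_DOMAIN = {
    "depuy-synthes.example": "jnj.com",
}


def normalize(name):
    """Lowercase the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def resolve_entity(name):
    """Return the top-level domain identifier for a name, or None if unknown."""
    domain = NAME_TO_DOMAIN.get(normalize(name))
    if domain is None:
        return None
    seen = set()
    # Walk up the parent chain; `seen` guards against cycles in the data.
    while domain in PARENT_DOMAIN and domain not in seen:
        seen.add(domain)
        domain = PARENT_DOMAIN[domain]
    return domain


def group_names(names):
    """Group raw company names under the identifier they resolve to."""
    groups = defaultdict(list)
    for name in names:
        groups[resolve_entity(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    print(group_names(["Johnson & Johnson", "J&J", "jnj", "DePuy Synthes", "Unknown Co"]))
```

Names that cannot be resolved end up under the `None` key, so they can be reviewed separately rather than silently merged into another company.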
First challenge solved.
But company names are not in themselves terribly useful for analysis. We needed to enrich the database with more information about each company in order to do a full analysis. Next week, we’ll talk about other ways we used NLP to build up the database, and tackle the big question: did it actually work?