There has been a noticeable buzz around some startup auto insurers that are leveraging big data and AI to more accurately price risk in the personal auto market. They contend that minority groups are disparately impacted by traditional rating metrics and believe big data and AI can more accurately reflect an individual’s risk profile. Additionally, these startups claim that sharing pertinent risk data with their customers will help reduce claims and save costs.
Let me begin by saying that I love how these startups are grabbing the bull by the horns and are changing the way consumers and businesses are thinking about insurance. It takes a lot of guts, gumption, and grit to be able to pull it off. And that is commendable. That said, I would like to step back and offer a few critical thoughts on this effort. I will start by providing some context and then I will argue two points:
- If the traditional rating metrics accurately measure risk, then the disparate impact argument against them has no standing.
- At best, informing consumers of pertinent risk data will likely have a negligible effect. At worst, it will shift the risk elsewhere and might even negatively impact minority groups.
Let’s begin by taking a quick look at traditional rating metrics.
Current Rating Metrics:
Insurance carriers use a wide array of information to determine appropriate rates. This ranges from past claims history and driving experience to demographic information such as ZIP code, marital status, homeownership, educational attainment, and insurance scores. Carriers also use proprietary weighting, meaning the degree to which any one metric affects the rate differs from carrier to carrier and even from person to person. For example, many carriers place a lot of weight on one's past claims history, but if a person has limited driving experience, the carrier will weigh other demographic metrics more heavily. The demographic metrics are what the startups are attempting to move away from, as they seem to disparately impact minority groups.
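To make the idea of proprietary weighting concrete, here is a minimal sketch of a multiplicative rating-factor model. Everything in it is invented for illustration: the factor names, the multipliers, and the two hypothetical carriers are assumptions, not any real carrier's filed rating plan.

```python
# Toy multiplicative rating model: a base rate times a relativity for each
# rating factor. Multipliers below 1.0 discount the rate; above 1.0 surcharge it.
# All names and numbers are hypothetical.

def annual_premium(base_rate, factors):
    """Apply each rating-factor relativity to the base rate."""
    premium = base_rate
    for _name, multiplier in factors.items():
        premium *= multiplier
    return round(premium, 2)

# Two hypothetical carriers weighing the same driver differently:
# Carrier A leans on claims history; Carrier B leans on demographic proxies.
carrier_a = {"clean_claims_history": 0.80, "insurance_score": 0.95, "zip_code": 1.05}
carrier_b = {"clean_claims_history": 0.95, "insurance_score": 0.85, "zip_code": 1.20}

print(annual_premium(1200, carrier_a))  # 957.6
print(annual_premium(1200, carrier_b))  # 1162.8 -- same driver, higher rate
```

The point of the sketch is only that the same person can be priced very differently depending on which metrics a carrier emphasizes, which is exactly the opening the startups are attacking.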
Disparate impact, as a reminder, is not concerned with intent but with outcome. For example, an employer could implement a non-discriminatory rule that ends up adversely impacting a protected class. In court, the employer could defend the rule by showing it was a business necessity; however, if the employee can show that a nondiscriminatory alternative exists, the employee would likely prevail (Mann & Roberts, 2011). There is a difference between fair and unfair discrimination, and most discrimination is not the bad kind. A hiring manager, for instance, discriminates between job applicants based on resume strength and personality, but not on ethnicity. Similarly, all rating metrics, whether traditional or AI-derived, seek to fairly discriminate between people who present higher and lower risks. Carriers are not using ethnicity or race to determine rates, but the startups argue that they are using other demographic metrics that act as proxies for ethnicity and race.
With insurance, the customer base is a wide cross-section of society that includes all protected classes. So, if using certain rating metrics disparately impacts a protected class and there are other nondiscriminatory alternatives, well then, we have an issue to address. The startups plan to offer the market a nondiscriminatory alternative. In particular, they plan on taking aim at insurance scores.
Some minority populations have lower average credit scores than others. So, in theory, and to some degree, their insurance rates would also be higher on average. Again, the extent to which an insurer weighs an insurance score varies. The argument against including it as a rating factor is that structural racism is largely to blame for poor credit scores in minority populations and that it is therefore unfair to use it in rating.
In her Washington Post article, Michelle Singletary explains how redlining and discriminatory employment and prosecution practices have led to Black people having lower credit scores on average (Singletary, 2020). In fact, according to Shift Processing, the Black population has the lowest average score at 677, whereas the Asian population has the highest at 745 (Shift Credit Card Processing, 2021). It follows, then, that using a credit score as a proxy for risk would disparately impact the Black population. The same could be argued for the Hispanic population, but to a lesser degree.
But insurance scores are not the same as credit scores. According to the Insurance Information Institute, insurance scores use the information contained in a credit report, but they do not use it in the same way that, say, Equifax does to create a FICO score (The Insurance Information Institute, 2021). Insurance scores do not measure creditworthiness (National Association of Insurance Commissioners, 2021). Even so, accounting for the differences between what constitutes a credit score versus an insurance score does not solve the problem. A disparate impact argument could also be made for using other demographic metrics such as education, marital status, and homeownership. That said, according to many studies, insurance scores and other demographic metrics are good predictors of risk. But what the hell does being married or paying debts on time have to do with driving habits?
Correlation & Causation:
Most risk indicators look at correlations. A correlation simply indicates whether there is a relationship between two variables and, if there is, how strong that relationship is. For example, according to the Insurance Institute for Highway Safety, both age and gender correlate with risk. On average, young folks tend to present a higher risk than older folks, and young males present a higher risk than young females (Insurance Institute for Highway Safety, 2021). Admittedly, a correlation coefficient between two variables does not tell us exactly what causes what, but it does tell us that the causal link(s) exist somewhere between them.
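The correlation-versus-causation point can be sketched in a few lines. The data below are invented: hypothetical driver ages paired with made-up claim frequencies. A strong negative coefficient tells us the two variables move together; it says nothing about which causal mechanisms (testosterone, experience, parenting) actually drive the relationship.

```python
# Pearson correlation coefficient, computed from scratch on invented data.
import math

def pearson(xs, ys):
    """Pearson's r between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

ages = [18, 22, 30, 45, 60]              # hypothetical drivers
claims = [0.45, 0.30, 0.20, 0.12, 0.10]  # invented claims per policy-year

r = pearson(ages, claims)
print(f"r = {r:.2f}")  # strongly negative: claim frequency falls with age
```

On this toy data, r comes out strongly negative, mirroring the aggregate age pattern the IIHS reports, while remaining silent on causes.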
I can personally speak to my experience as being a young male with insurance premiums so high that I could barely afford to drive my car. As a late teen, it was not uncommon for me and my friends to meet up on a dark stretch of road on a Friday night and see if we could max out the speedometer on my buddy’s beat-up ‘78 Camaro. Our newly found influx of testosterone had us cheering on the struggling V8 as we watched the speedometer approach three digits. How else were we going to measure how effective new sparkplugs and high-octane gas were?
From there we would usually end up at the sandpit hanging around a bonfire and planning our next move (which usually involved burning rubber and explosions). Contrast that with my middle-aged neighbor who spent most Friday nights putting her kids to sleep and relaxing after a long week of work. There is no doubt that I should have been charged a higher premium than her. In this case, the rating was spot on. But let us consider a counterexample.
You could have an 18-year-old male, let's call him Proper Pat, who is a safe and courteous driver and takes trips to the local coffee shop to enjoy a donut and study for his SATs. Conversely, you could have a 36-year-old female, Daring Debbie, who is an adrenaline junkie and enjoys doing donuts in her Mustang after smoking a joint. In this case, the insurers would likely get the rates backward. Similarly, not all folks who own homes and are married present a lower risk. But these latter cases are the exceptions, and in the aggregate, the correlations work very well. But what if these exceptions disproportionately fall into the laps of minority groups? What can we do? What is the ideal situation?
Ideal Rating Method:
The philosopher John Rawls is well known for what he called the “veil of ignorance.” The short version goes something like this: The folks organizing society and making the rules should do so behind a veil of ignorance such that they do not know anything about their natural abilities or their position in society when they are designing it. Therefore, they have an incentive to be more egalitarian and be considerate of all possible circumstances. In other words, I will cut the cake and you decide who gets the pieces. So, the designers of a Rawlsian rating system would operate under a veil of ignorance, not knowing their demographic information so as to provide the fairest rating system possible. With that, what would be the ultimate goal of such a system?
The ultimate goal is very simple: to match people's premiums to the actual risk they present to the insurance pool. So Proper Pat would pay less in premium than Daring Debbie even though Pat is a younger male. To achieve this, carriers would need to isolate the causal risk factors for each individual while keeping in mind that not everyone will have the same causal factors. Moreover, they would also have to consider to what extent other variables influence each particular causal factor. For example, an underdeveloped prefrontal cortex in a teenage male might be a causal factor for driving recklessly. But the degree to which he is egged on by his crazy friends or tamed by his responsible parents would amplify or reduce the outcome associated with that causal factor. You can see how this quickly turns a simple and laudable goal into a complex and seemingly insurmountable one.
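The "match premium to actual risk" goal has a standard actuarial skeleton: expected loss equals claim frequency times claim severity, plus a loading for expenses and profit. Here is a minimal sketch of that idea applied to Proper Pat and Daring Debbie; the frequencies, severity, and expense load are all invented numbers, not anyone's actual rates.

```python
# Toy individual-risk pricing: premium driven by each insured's own expected
# loss rather than a group average. All figures are hypothetical.

def pure_premium(frequency, severity):
    """Expected annual loss = expected claims per year x average claim cost."""
    return frequency * severity

def rate(frequency, severity, expense_load=1.25):
    """Gross premium: expected loss plus a multiplicative expense/profit load."""
    return round(pure_premium(frequency, severity) * expense_load, 2)

# Proper Pat: young male, but his individual data suggest few claims.
# Daring Debbie: mid-30s female, but her individual data suggest many.
print(rate(frequency=0.05, severity=8000))  # Pat: 500.0
print(rate(frequency=0.20, severity=8000))  # Debbie: 2000.0
```

Under this scheme, Pat pays less than Debbie despite his demographics, because the inputs are his causal risk drivers rather than his cohort's averages. The hard part, as the paragraph above notes, is that nothing about estimating those individual frequencies is simple.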
So maybe a better question to ask is how we get closer to the ultimate goal knowing that we may never reach it. The startups seek to answer this question with robust data and AI. A very important assumption here is that traditional metrics are less accurate at predicting risk than the more granular metrics offered by the startups. This brings us to the first criticism:
If the traditional rating metrics accurately measure risk, then the disparate impact argument against them has no standing. Imagine Daring Debbie purchases a new policy with one of the startups. After a couple of months, her rate gets jacked up because the granular data show she regularly speeds and travels on dangerous roads late at night. Consequently, Daring Debbie switches to a legacy carrier. The legacy carrier does not see the granular data the startup saw, but it sees that Daring Debbie has a poor insurance score, two speeding tickets, is not married, and does not own a home. As a result, the legacy carrier, using traditional rating metrics, offers Daring Debbie the same rate the startup did. If scenarios like this play out in the market, the disparate impact argument has no standing.
Conversely, if the data show that traditional metrics were not accurate and minority groups were disproportionately impacted because of demographic metrics that do not reflect risk, we should be prepared to acknowledge the problem and make all the necessary changes. Some of the startups are operating as if the question has already been answered. I am not convinced it has been, but I think it absolutely needs to be.
This brings us to the second criticism: using big data to inform insureds of risk will either have a negligible effect or simply shift the risk elsewhere. At worst, it could create incentives that are antithetical to the social causes the startups are working to promote.
Using data to inform insureds of high-risk activities such as driving at night, driving on dangerous roads, or traveling the day before Thanksgiving would seem to be a win-win. Insureds would presumably reduce their risk, and insurers would pay out less in claims. However, this assumes most people are rational, risk-averse, and responsive to small savings in premium. If behavioral economics has taught us anything, it is that people are not as rational as we think.
For instance, most folks know that speeding, eating crap food, and not exercising all increase their risk of health issues or even death, yet they are not deterred. Think about it. After driving past a horrific car wreck on the highway, we tend to slow down for about 20 minutes. Then the salience fades and the speed goes back up. At the last minute, we see a sign for a taco joint and cut across two lanes of traffic so as not to delay stuffing our faces any longer than is absolutely necessary. With that, I do not think people will heed the advice given by the startups, despite its accuracy. Further, if the advice were heeded, it might shift the risk somewhere else and could even adversely impact minority groups.
Let us say, for example, that the startup app tells us to avoid the main intersection in town because it is the most dangerous spot to drive through. If we heed this advice, traffic would increase on the side roads, pushing the risk somewhere else. Perhaps those side roads have more kids playing on them, and instead of crashing into cars, we are now crashing into bicycles.
Further, what happens if the data show that the most dangerous streets and intersections are in areas with higher population densities? And what if these areas also have a disproportionately high number of minority groups living and working there? Not only would premiums be higher for the people who live there, but premiums would also increase for people who travel there for work and play. Would this premium increase create a negative incentive that is tantamount to modern-day redlining? In other words, people would be rewarded for avoiding risky areas and would possibly pull businesses away from minority groups. Far-reaching, maybe. But plausible.
If we look at a couple of examples, there is some credence to the possibility outlined above. According to Statista, 44% of the Black population in Illinois lives in Chicago, compared with just 18% of the white population; 75% of the Black population in New York lives in NYC, compared with just 33% of the white population (Statista, 2021). In other words, if population density is correlated with dangerous streets and intersections, then the Black populations in Illinois and New York would be paying higher rates on average. I am genuinely curious how the startups would sort that out.
It comes down to accurately matching the rate to the risk. The startups contend that some of the traditional rating metrics are arbitrary and do not accurately reflect risk. They go further to add the social justice piece that claims these arbitrary metrics disparately impact minority groups. However, if the traditional rating methods accurately match the rate to the risk, then there is no disparate impact issue as it is a sort of business necessity for insurers to charge higher premiums for higher risk.
If new technology can be leveraged to discover and quantify individual causal links to risk factors and, consequently, move away from using averages, then we should work toward developing and adopting it. At the same time, if the new data show substantially the same thing the traditional data did, then we should acknowledge as much and point the social justice flashlight somewhere that needs it. However, if the startups can show that insurance scores, homeownership, and other demographic data do not map onto actual risk, then changes need to be made and perhaps premium credits should be issued. Let's see what is standing once the dust settles.
Bennett-Alexander, D. D., & Hartman, L. P. (2012). Employment Law for Business. New York: McGraw-Hill.
Insurance Institute for Highway Safety. (2021, March 1). Fatality Facts 2019 Males and Females. Retrieved from IIHS: https://www.iihs.org/topics/fatality-statistics/detail/males-and-females
Mann, R. A., & Roberts, B. S. (2011). Business Law. Mason: South-Western Cengage Learning.
National Association of Insurance Commissioners. (2021, Jan 1). Understanding How Credit Scores Impact Your Premium. Retrieved from NAIC.org: https://content.naic.org/article/consumer_insight_creditbased_insurance_scores_arent_same_credit_score_understand_how_credit_and_other_factors.htm
Shift Credit Card Processing. (2021, January 1). Credit Score Statistics. Retrieved from Shift Processing: https://shiftprocessing.com/credit-score/#race
Singletary, M. (2020, October 16). Credit scores are supposed to be race-neutral. That’s impossible. Retrieved from The Washington Post: https://www.washingtonpost.com/business/2020/10/16/how-race-affects-your-credit-score/?arc404=true
Statista. (2021, February 1). Retrieved from Statista.com: https://www-statista-com.ezproxy.lsus.edu/
The Insurance Information Institute. (2021, March 1). Credit and Insurance Scores. Retrieved from III.org: https://www.iii.org/article/what-does-my-credit-rating-have-do-purchasing-insurance-1