This is the first in a two-part series.
The discovery of certain weaknesses inherent in generative AI has been a cause for amusement, even relief, among some. Since ChatGPT’s public debut in autumn 2022, it and its analogs have been the subject of a great deal of hype, hope and fear. But not everybody is laughing; this technology is potentially very powerful, albeit in a limited way. Those limits are constantly being tested in a variety of often unsound use cases.
More important than what generative AI does—as of now, at least—is how it’s perceived: Since late 2022, generative AI has exercised a powerful hold over the business world’s collective imagination. Many high-ranking, powerful people see promise where there is only potential and have begun making plans without a clear path to adoption. The insurance sector has followed suit.
In a recent Forbes piece, Judith Magyar claims, “The digital transformation of insurance companies has increased speed, efficiency, and accuracy across every branch of insurance,” without backing up her claim with any data. This sort of blind faith is characteristic of the excitement around new technologies generally.
In fairness, large language models (LLMs)—such as ChatGPT, Bard, etc.—have only been available to the public for a little over a year. That’s not a lot of time to gather a great deal of data. But why presume one outcome over another in the absence of data? Isn’t data-driven analysis key to insurance program development?
A few small studies do offer some insight:
- A 2023 MIT study found that workers assigned tasks such as writing cover letters, emails and analyses, tasks that took a control cohort between 20 and 30 minutes to complete, finished them 11 minutes faster and produced output of 18% higher quality when assisted by ChatGPT. This small, not-yet-replicated study speaks to a highly specific use case; its results are not necessarily translatable to the workplace in a general way.
- A Stanford Business School study found that using a generative AI tool increased call center worker productivity by 14%. Employees with less experience or skill saw a 35% increase in productivity in the form of reduced time on calls and chats, as well as a reduction in the time it took them to solve customer issues. As discussed in the study, these gains are outsized, albeit not necessarily relevant to more sophisticated job types.
That’s where we are at the moment. But AI optimism isn’t wholly misplaced. Let’s take a close look at the state of AI in the insurance sector and where things could be headed.
Is it all hype?
An Academy of Management piece establishes that businesses are highly mimetic, and a piece in the Harvard Business Review says many companies tend to prefer imitation over innovation. This explains why generative AI has taken over the corporate imagination: It’s a prêt-à-porter technology, and as one company puts it to use, another will follow, and another, even in the absence of data indicating that it’s relevant to their individual use cases.
It should be said here that generative AI doesn’t actually generate anything; it can’t “create” content beyond the language that has been put into it. Rather, it synthesizes, and what it comes up with can be only as good as the data it was trained on.
One may ask, “Isn’t this the same thing as human innovation? Don’t we just synthesize what we know and experience and develop output from there?” In short, no: For starters, AI cannot attain any sensory knowledge of the world; the information it gathers is therefore missing a dimension that is key to true creation. In other words, AI, no matter how sophisticated, doesn’t possess the elemental data-gathering capabilities it would take to call it a mind, let alone a creative mind.
That doesn’t mean it isn’t powerful, or that it can’t be useful.
Benefits of generative AI
The benefits of generative AI in the insurance sector, as elsewhere, are still largely theoretical. But sector-specific use cases do—or could—exist.
Cost reduction
It’s been understood for centuries that the purpose of mechanizing tasks is to reduce the need for human labor enough to cut costs and boost revenue. Automation is merely mechanization taking place in the digital world; the goals, and often the results, of the process are the same.
IBM has found that, regarding generative AI, their “experience with foundation models indicates that there is between 10x and 100x decrease in labeling requirements and a 6x decrease in training time (versus the use of traditional AI training methods).”
The ability to reduce training time is particularly interesting here. Human error doesn’t happen at a constant rate: People new to a job make more errors than they do once they’ve gained some experience with it. Getting employees fully up to speed sooner after onboarding therefore makes sense from an error-reduction point of view.
That doesn’t mean that generative AI can do everything. Some say it’s transforming underwriting, but it isn’t sophisticated enough to serve as a company’s sole underwriter; for many companies, that isn’t the goal anyway. The point is simply to be clear about the technology’s limitations within an insurance sector context.
Complex data analysis
An LLM trained to recognize patterns in data can quickly extract information from so-called unstructured data sources, i.e., information assets not created with a predefined structure. These include:
- Web pages
- Images and alt text
- Videos
- Memos
- Reports, including medical and financial records
- Word documents
- PowerPoint presentations
- Surveys
Note that some of these data assets include highly sensitive information (medical/financial reports), the acquisition of which can be legally dicey. In any case, a machine learning tool that can take in and parse so much data, so much faster than any person could, and teach itself to make something intelligible of it all, would be a boon ROI-wise, not to mention a major efficiency boost for data scientists.
This is what LLMs were designed to do: They collate information and, via an algorithm that predicts the statistical likelihood of one word following another, generate what appears to be a new piece of information. This can be helpful when one needs a great deal of data summarized or the point of a complex study explained in plain language for a general audience.
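To make the extraction use case concrete, here is a minimal sketch in Python of how an insurer might ask an LLM to pull structured fields out of a free-text claims memo. Everything in it is illustrative: the `complete()` function is a stand-in for whichever LLM API a company actually uses (here it returns a canned reply so the sketch runs end to end), and the memo, field names and values are invented.

```python
import json

def complete(prompt: str) -> str:
    """Stand-in for a call to whichever LLM service you use.
    It returns a canned reply so this sketch runs end to end; in practice
    it would send `prompt` to the model and return the model's text."""
    return ('{"policy_number": "HO-99812", "loss_type": "water damage", '
            '"estimated_cost": 4200, "follow_up_needed": true}')

# An unstructured source: a free-text claims memo (invented for illustration).
memo = """
Spoke with the insured on 3/14. Water damage to the kitchen ceiling after
an upstairs pipe burst; plumber's estimate is $4,200. Policy number
HO-99812. Insured requests a callback before Friday.
"""

# Ask the model to return only the fields we care about, as JSON.
prompt = f"""Extract the following fields from the memo below and return
them as a JSON object with exactly these keys:
policy_number, loss_type, estimated_cost, follow_up_needed.

Memo:
{memo}"""

raw = complete(prompt)
claim = json.loads(raw)  # in practice, validate before trusting the output
print(claim["policy_number"], claim["estimated_cost"])
```

The value of a setup like this is that the model only does the first, tedious pass over the unstructured text; a person still checks and validates what comes out.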
Risk reduction
An LLM trained on a vast expanse of data could help make better data-driven decisions in terms of risk assessment. This makes some intuitive sense: The more data one gathers, the more precisely and directly one can act in one’s interest. Risk also exists in the form of human error—oversights due to boredom, incompetence, etc. AI theoretically eliminates such risks.
The problem here has to do with the quality of the data any individual LLM is trained on. If that data is poor (factually incorrect, poorly written or topically irrelevant), remember that an LLM can only produce a synthesis of what was put into it. Its output, therefore, will reflect the relative poverty of that input. Garbage in, garbage out.
Risks of generative AI
IBM warns, “Large, well-established insurance companies have a reputation of being very conservative in their decision making, and they have been slow to adopt new technologies. They would rather be ‘fast followers’ than leaders, even when presented with a compelling business case. This fear of the unknown can result in failed projects that negatively impact customer service and lead to losses.”
In other words, IBM would recommend adopting generative AI right away. But should you?
Data privacy
As mentioned above, LLMs are trained on enormous data sets. This is how they become “smarter” and capable of generating ever more sophisticated content. The risk here is that they may be trained, in part, on sensitive, proprietary data. That data may include private information about customers and employees. It may include proprietary company information or even pieces of conjecture or editorializing.
This means that an AI model could generate text that reveals private information. A human editor may catch this before publication. That doesn’t wholly eliminate the problem, however: Proprietary data sets are sometimes eccentric; that is, they represent an anomalous cohort rather than a statistical standard. Training an LLM on anomalies rather than statistically standard data changes its locus of analysis. This may generate strange results, for example, indicating risk where a model trained on a larger, more representative body of data would see that there’s actually no risk, or not to the same degree.
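A toy illustration of that skew, with invented numbers, shows how an estimate learned from an anomalous book of business can drift away from the population it is supposed to describe:

```python
# Invented numbers: claim frequency (claims per policy-year) in the broader
# population versus in one anomalous, high-loss book of business.
population_rate = 0.05                              # ~1 claim per 20 policy-years
anomalous_sample = [0.18, 0.22, 0.15, 0.20, 0.19]   # the eccentric cohort

learned_rate = sum(anomalous_sample) / len(anomalous_sample)

print(f"Rate learned from the anomalous sample: {learned_rate:.2f}")
print(f"Rate in the broader population:         {population_rate:.2f}")
# A model calibrated on the anomalous book would flag "risk" roughly four
# times more often than the broader population actually warrants.
```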
Inaccuracies and bias
Last summer, a law firm was fined after one of its partners queried ChatGPT for relevant legal precedents. ChatGPT generated several, each of which it made up. Those fabricated cases were presented in court, to the consternation of the presiding judge and, consequently, the humiliation of the legal team.
No matter how much data an LLM is trained on, it can still synthesize it in an inexplicable, even offensive way: Early reports on ChatGPT stressed its weird, biased, racist answers to certain prompts.
Imagine working in an insurance sector that conventionalized the use of ChatGPT in underwriting. What would a model trained on biased data end up doing? It could reject otherwise viable applicants outright, for reasons a human underwriter would consider inappropriately biased. Yes, generative AI can help cull applications so underwriters can examine eligible customers in a sophisticated way. But if it culls according to biases it has taught itself to focus on, your company could find itself looking down the barrel of a major lawsuit.
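One hedge against that outcome, sketched below with invented group labels and counts, is a routine audit of the model’s decisions: compare approval rates across applicant groups and flag any group whose rate falls below 80% of the best-performing group’s, a rule of thumb borrowed from U.S. employment-discrimination analysis (the “four-fifths rule”).

```python
# Hypothetical approval counts per applicant group from an AI-assisted
# underwriting pipeline (all numbers invented for illustration).
decisions = {
    "group_a": {"approved": 420, "total": 500},
    "group_b": {"approved": 280, "total": 500},
}

# Approval rate for each group, and the best rate observed.
rates = {group: d["approved"] / d["total"] for group, d in decisions.items()}
best = max(rates.values())

# Four-fifths rule of thumb: flag any group whose approval rate is below
# 80% of the best-performing group's rate for human review.
for group, rate in rates.items():
    ratio = rate / best
    status = "REVIEW" if ratio < 0.8 else "ok"
    print(f"{group}: approval rate {rate:.0%}, ratio to best {ratio:.2f} -> {status}")
```

An audit like this doesn’t prove or disprove bias on its own, but it surfaces patterns a human should examine before a regulator or a plaintiff does.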
Unsuitability to the sector
Considering the potential for bias and data privacy breaches, is generative AI even suited to the insurance sector in the first place?
Let’s say it is, and everyone starts using it. If generative AI is conventionalized as an insurance tool, what recourse do customers have when it rejects them or offers them unsuitable policies? Can a company’s AI model take in new information quickly enough to revisit a prior decision?
Generative AI may help address customer concerns more quickly than human customer service agents can. The flip side of that is that the technology can never be as warm, as sympathetic, as human as a human worker. And emotions can run high when it comes to insurance.
According to a Harvard Business Review study, automated customer service systems only score as high as human-based ones when customers are offered a financial incentive to use them. In other words, no matter how sophisticated or human-like an AI bot seems, customers are vastly less likely to prefer it to a human customer service agent unless you pay them to.
Next: Will generative AI take my job?
This article originally appeared on Arrowhead General Insurance Agency, Inc.’s blog. It is used with permission and has been updated to better fit the needs of AP’s customers.