AI-Driven Insurance Decisions Raise Serious Questions About Human Oversight in Health Care
Health insurance companies are increasingly turning to artificial intelligence to help decide whether medical treatments, drugs, and procedures should be covered. The goal is efficiency: faster decisions, lower administrative costs, and less paperwork. But according to new research from Stanford University, this growing reliance on AI is also raising major concerns about human oversight, transparency, and fairness in health insurance decisions.
The issue is explored in depth in a recent article published in the journal Health Affairs, authored by Michelle M. Mello and three Stanford colleagues. Their work examines how AI is being used in insurance utilization review and why this trend, if left unchecked, could intensify existing problems in the health care system rather than fix them.
Why insurers are embracing AI so quickly
Utilization review is the process insurers use to decide whether a requested medical service is appropriate and eligible for coverage. This includes prior authorization, a practice that requires doctors to get approval from insurers before providing certain treatments. For years, physicians and patients have criticized prior authorization for causing delays, denials, and administrative burden that contributes to burnout.
From an insurer's perspective, AI appears to offer a solution. Algorithms can quickly check coverage rules, analyze large volumes of requests, and flag cases for approval or denial. According to a 2024 survey by the National Association of Insurance Commissioners, 84% of large health insurers across 16 U.S. states reported using AI for at least some operational purposes. That level of adoption shows how deeply AI is already embedded in the insurance industry.
However, the Stanford researchers argue that speed and scale come with serious trade-offs.
The problem with limited human review
One of the most significant concerns raised in the study is the lack of meaningful human oversight when AI systems make recommendations. While insurers often claim that humans remain "in the loop," the reality is more complicated.
Human reviewers may not have enough time to carefully examine AI-generated decisions. They may also lack the technical understanding needed to evaluate how the algorithm reached its conclusion. On top of that, reviewers may face incentives that subtly encourage rubber-stamping AI recommendations rather than challenging them.
This combination creates a risk that wrongful denials of care could occur without anyone catching the mistake. The researchers emphasize that when AI tools are treated as authoritative rather than advisory, human oversight becomes largely symbolic.
Opacity and the challenge of accountability
Another major issue is the opacity of AI algorithms. Many of the systems used by insurers function as black boxes, meaning neither patients nor providers can easily understand why a particular decision was made.
When an insurance claim or prior authorization request is denied, patients and doctors often want to know the reasoning behind it. If that reasoning is buried inside a complex algorithm, it becomes much harder to challenge or appeal the decision. This lack of transparency undermines trust and limits accountability.
The study points out that opaque decision-making is especially troubling in health care, where coverage determinations can directly affect patient outcomes.
Missing context and baked-in bias
AI systems are only as good as the data they are trained on. The Stanford researchers highlight several ways this can go wrong in insurance settings.
For example, AI tools that assess whether a patient is ready to be discharged from a rehabilitation hospital often lack data about social supports at home, such as whether the patient has family members who can help with care. Without this information, the algorithm may recommend decisions that are clinically inappropriate or unsafe.
There is also the issue of historical bias. Algorithms trained on past insurance decisions may reinforce flawed practices that already exist. If previous coverage determinations were too restrictive or unfair, AI systems trained on that data will likely replicate those patterns rather than correct them.
Compounding the problem, many insurers lack strong governance processes to monitor accuracy, detect bias, or audit the outcomes associated with AI tools.
The broader context of prior authorization failures
The researchers place AI-driven insurance decisions within the larger context of prior authorization, which has long been a source of frustration in health care.
Even before AI entered the picture, studies documented large numbers of prior authorization denials and extremely high reversal rates on appeal. In Medicare Advantage plans, for example, 82% of appealed denials were overturned, suggesting that many initial denials should never have happened in the first place.
From 2019 to 2023, Medicare Advantage plans approved more than 93% of prior authorization requests, meaning most requests are ultimately deemed appropriate. This raises an obvious question: if most requests are approved anyway, why subject patients and providers to such a burdensome process?
Where AI could actually help
Despite their concerns, the Stanford researchers are careful to point out that AI is not inherently harmful. When used thoughtfully, it could address some of the very problems it is accused of worsening.
One potential benefit is the automation of clearly allowable requests. If AI systems can reliably identify requests that obviously meet coverage criteria, those cases could be approved automatically. This would reduce delays, lower stress for patients and clinicians, and allow human reviewers to focus on complex or borderline cases.
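To make the idea concrete, here is a minimal, hypothetical sketch of what such a triage rule might look like. It is not a description of any insurer's actual system, and the score, threshold, and function names are invented for illustration; the key design choice it shows is that the algorithm only ever approves or escalates, and never denies on its own.

```python
# Hypothetical sketch: auto-approve only clearly allowable requests,
# route everything else to a human reviewer. Names and threshold are invented.
from dataclasses import dataclass


@dataclass
class ReviewDecision:
    outcome: str   # "auto_approve" or "human_review"
    reason: str


def triage_request(meets_criteria_score: float,
                   auto_approve_threshold: float = 0.98) -> ReviewDecision:
    """Route a prior authorization request.

    meets_criteria_score: a model's estimated probability that the request
    clearly satisfies the plan's coverage criteria (a hypothetical input).
    """
    if meets_criteria_score >= auto_approve_threshold:
        # Only unambiguous approvals are automated.
        return ReviewDecision("auto_approve", "clearly meets coverage criteria")
    # Borderline or likely-denial cases always go to a human reviewer.
    return ReviewDecision("human_review", "requires clinician judgment")


# Example: a borderline request is escalated rather than auto-denied.
print(triage_request(0.85))
```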
AI could also reduce denials caused by incomplete or unclear submissions. Insurers typically do not have direct access to full medical records and instead rely on summaries prepared by non-clinical staff. AI tools can help extract relevant clinical data, flag missing information, and guide staff on medical necessity requirements.
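A minimal sketch of the "flag missing information" step might look like the following. The required-field list and record format are assumptions made for illustration, not details drawn from the study.

```python
# Hypothetical sketch: flag required fields that are absent or empty in a
# prior authorization submission before it is sent for review.

# Fields a hypothetical plan might require before a request can be evaluated.
REQUIRED_FIELDS = ["diagnosis_code", "requested_service",
                   "prior_treatments", "clinical_notes"]


def flag_missing_information(submission: dict) -> list:
    """Return the names of required fields that are missing or empty."""
    return [field for field in REQUIRED_FIELDS if not submission.get(field)]


# Example: a submission with no clinical notes is flagged before it can be
# denied for "insufficient documentation."
submission = {
    "diagnosis_code": "M17.11",
    "requested_service": "total knee arthroplasty",
    "prior_treatments": ["physical therapy", "NSAIDs"],
    "clinical_notes": "",
}
print(flag_missing_information(submission))  # ['clinical_notes']
```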
Another promising area is appeals and communication. Predictive tools can identify which denials are most likely to be overturned, while generative AI can help draft appeal letters by synthesizing clinical information. AI could also make Explanation of Benefits letters easier for patients to understand and help insurers meet regulatory requirements to provide specific reasons for denials.
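For the predictive piece, a prioritization step could be as simple as sorting denials by a model's estimated probability of being overturned, so the strongest candidates are appealed first. The field names below are assumptions; the study does not specify any particular method.

```python
# Hypothetical sketch: rank denied requests by predicted overturn probability.

def prioritize_appeals(denials: list) -> list:
    """Sort denials so those most likely to be overturned are appealed first.

    Each record is assumed to carry 'overturn_probability', a score from some
    upstream predictive model (a hypothetical field name).
    """
    return sorted(denials, key=lambda d: d["overturn_probability"], reverse=True)


denials = [
    {"id": "A-102", "overturn_probability": 0.35},
    {"id": "A-219", "overturn_probability": 0.91},  # strong appeal candidate
    {"id": "A-044", "overturn_probability": 0.62},
]
for denial in prioritize_appeals(denials):
    print(denial["id"], denial["overturn_probability"])
```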
The data gap and public trust problem
One of the most frustrating aspects of this debate is the lack of publicly available data. Insurers often argue that AI improves efficiency and accuracy, but they rarely share evidence to support those claims.
The researchers note that many insurers do not appear to have internal systems robust enough to evaluate whether AI is actually improving outcomes or simply shifting errors around.
This lack of transparency feeds into a broader trust problem. Surveys show that two-thirds of U.S. adults have little trust that AI will be used responsibly in health care. Health insurers, already among the least trusted actors in the system, face an uphill battle when introducing opaque technologies into coverage decisions.
Not surprisingly, the use of AI in insurance has already sparked controversy and litigation, with patients and providers questioning whether automated systems are being used to deny care unfairly.
What this means for the future of health insurance
The Stanford study makes it clear that AI in health insurance is not going away. The question is whether it will be deployed in ways that genuinely improve care or whether it will supercharge existing flaws.
Strong governance, meaningful human oversight, transparency, and public accountability are not optional extras. They are essential safeguards if AI is to play a constructive role in insurance decision-making.
Without them, the promise of efficiency may come at the cost of fairness, trust, and patient well-being.
Research paper reference:
Michelle M. Mello et al., Health Affairs (2025). https://doi.org/10.1377/hlthaff.2025.00897