Dive Brief:
- Artificial intelligence can improve clinicians’ diagnostic accuracy, but biased models make it worse, and explaining the technology’s logic didn’t help providers mitigate potential errors, according to a recent study published in JAMA.
- The research found an AI model improved clinician accuracy when reviewing case vignettes of patients hospitalized with acute respiratory failure, and accuracy increased even more when clinicians used the AI model alongside explanations of its reasoning.
- But a systematically biased model decreased their diagnostic accuracy, and explanations of the model’s reasoning didn’t significantly help them improve, according to the study.
Dive Insight:
AI has the potential to help providers spot abnormalities in imaging results, detect patterns in patient data and predict patient outcomes.
The technology has become an increasingly hot topic in the healthcare sector as experts and policymakers try to weigh its benefits against risks, such as biased recommendations that could further widen health inequities.
Systematically biased models, or models that consistently misdiagnose some patient subpopulations, could lead to errors or patient harm, the study notes. For example, a model trained on data in which women’s heart disease is often underdiagnosed could reproduce those misses once it’s deployed.
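To make that mechanism concrete, here is a minimal, hypothetical sketch, not drawn from the study, of how underdiagnosed training labels can bias a model. The simulated dataset, the 40% mislabeling rate and all variable names are illustrative assumptions.

```python
# Illustrative sketch: if women with disease are often recorded as
# healthy in the training labels, a model learns to under-call
# disease for women. All numbers and names here are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000
is_female = rng.integers(0, 2, n)              # 0 = male, 1 = female
severity = rng.normal(0, 1, n)                 # hypothetical risk signal
true_disease = (severity + rng.normal(0, 1, n) > 0.5).astype(int)

# Simulate underdiagnosis: 40% of diseased women are mislabeled healthy
recorded = true_disease.copy()
missed = (is_female == 1) & (true_disease == 1) & (rng.random(n) < 0.4)
recorded[missed] = 0

X = np.column_stack([severity, is_female])
model = LogisticRegression().fit(X, recorded)  # trained on biased labels
pred = model.predict(X)

# Sensitivity against the true labels drops for women, because the
# model learned that "female" predicts a negative recorded label.
for sex, name in [(0, "men"), (1, "women")]:
    diseased = (is_female == sex) & (true_disease == 1)
    print(f"Sensitivity for {name}: {pred[diseased].mean():.2f}")
```

Running the sketch prints a noticeably lower sensitivity for women than for men, even though the underlying disease process is identical for both groups; the bias comes entirely from the labels.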
Ideally, providers would be able to disregard a model’s recommendations when they seem off, and explanations have been offered as a way to help clinicians understand a model’s reasoning before they act on it. Guidance from the Food and Drug Administration has highlighted the value of explanations to improve interpretability, researchers wrote.
But the JAMA study found explanations might not be enough to mitigate harm.
Researchers presented hospital physicians, nurse practitioners and physician assistants with nine clinical vignettes of patients hospitalized with acute respiratory failure, including symptoms, physical exam findings, lab results and chest radiographs.
Clinicians were then asked to determine the likelihood of pneumonia, heart failure or chronic obstructive pulmonary disease. Some vignettes included AI model predictions with image-based explanations to lay out the model’s logic.
Clinician accuracy increased over baseline by 2.9 percentage points when using a standard AI model and by 4.4 percentage points when clinicians were also shown AI model explanations.
But a systematically biased model decreased accuracy by 11.3 percentage points, and pairing the biased model’s recommendations with explanations recovered only a nonsignificant 2.3 percentage points.
“Although standard AI models improve diagnostic accuracy, systematically biased AI models reduced diagnostic accuracy, and commonly used image-based AI model explanations did not mitigate this harmful effect,” researchers wrote.
Clinicians’ AI literacy might be one challenge preventing them from rooting out biased recommendations, the study’s authors said. Nearly 67% of participants weren’t aware models could be systematically biased.
They also might need more experience and training with the explanations. Images might also be a poor medium for conveying what a model is doing, and other methods, such as text descriptions, might be more helpful, researchers wrote.