Dive Brief:
- Generative artificial intelligence can be used to pull social determinants of health data, like housing or employment status, from clinician notes to identify patients who need additional support, according to a new study.
- Large language models trained by researchers could identify 93.8% of patients with adverse social determinants of health, while official diagnostic codes captured that information in only 2% of cases.
- The fine-tuned models were also less likely than OpenAI’s GPT-4 to change their determinations when demographic information like race or gender was added. Algorithmic bias is a major concern for AI use in healthcare, amid fears the technology could worsen health inequities.
Dive Insight:
Large technology firms like Microsoft, Google, Oracle and Amazon have invested heavily in generative AI over the past year. Many have homed in on using the tools, which can create original text and images, to lessen administrative burden for providers through use cases like automatically documenting conversations between patients and doctors.
But using large language models to extract social determinants of health data is another promising application for the technology, according to the study published in npj Digital Medicine.
Social determinants, which include information like access to transportation, stable housing and healthy food, are key to health outcomes. But the data often isn’t systematically organized in health records, making it challenging to pull that information into databases or easily identify patients who may need help.
“Algorithms that can notice things that doctors may miss in the ever-increasing volume of medical records will be more clinically relevant and therefore more powerful for improving health,” Danielle Bitterman, a study author and faculty member in the Artificial Intelligence in Medicine Program at Mass General Brigham, said in a statement.
Researchers set out to build language models that could extract that data from health records. They reviewed 800 clinician notes from 770 patients with cancer who received radiotherapy, tagging sentences that referenced social determinants like employment status, housing challenges, transportation issues, parental status and social support.
The researchers then tested the models with an additional 400 notes from patients who had been treated with immunotherapy.
Their fine-tuned models could consistently find references to social determinants, even though the sentences were often linguistically complex and only a small fraction of sentences in the notes referenced social issues at all, researchers said.
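The study doesn’t ship its training code with this article, but the workflow it describes — annotating sentences for social determinants, then fine-tuning a language model to flag them — follows a familiar pattern. Below is a minimal, hypothetical sketch using a generic Hugging Face sequence classifier; the model choice, example sentences and labels are illustrative assumptions, not the authors’ actual data or architecture.

```python
# Hypothetical sketch: fine-tuning a small pretrained language model to flag
# sentences that mention a social determinant of health (SDoH).
# The base model, toy sentences and labels below are illustrative only.
import torch
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "distilbert-base-uncased"  # stand-in; the study fine-tuned larger models

# Toy annotated sentences: 1 = references an SDoH, 0 = does not.
train_data = Dataset.from_dict({
    "text": [
        "Patient lost his job last month and is worried about rent.",
        "She lives alone and has no reliable transportation to appointments.",
        "Vital signs stable, afebrile, lungs clear to auscultation.",
        "Continue current dose of lisinopril and recheck labs in 3 months.",
    ],
    "label": [1, 1, 0, 0],
})

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def tokenize(batch):
    # Pad to a fixed length so the default data collator can batch examples.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

train_data = train_data.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sdoh-classifier",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=train_data,
)
trainer.train()

# Flag a new sentence from a clinician note.
note_sentences = ["He is currently staying in a shelter downtown."]
inputs = tokenizer(note_sentences, return_tensors="pt",
                   truncation=True, padding=True)
inputs = {k: v.to(model.device) for k, v in inputs.items()}
with torch.no_grad():
    prediction = model(**inputs).logits.argmax(dim=-1)
print(prediction)  # 1 -> sentence likely references a social determinant
```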
The study’s authors also created synthetic clinical notes using ChatGPT and used that data to compare their models’ performance against the general-purpose language model GPT-4.
Though both model types changed their determinations about social determinants when demographic descriptors such as race and gender were added, the discrepancy rate for the fine-tuned models was half that of GPT-4’s.
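The probe the researchers describe — checking whether a model’s determination flips once demographic descriptors are injected into otherwise identical text — can be summarized in a few lines. The sketch below is a simplified, hypothetical illustration: the `classify` callable stands in for any model call (a fine-tuned model or a GPT-4 prompt), and the descriptors, sentence and keyword classifier are assumptions, not the study’s synthetic data.

```python
# Hypothetical sketch of a demographic-perturbation bias check: classify the
# same sentence with and without injected descriptors and report how often
# the label flips. All inputs below are illustrative.
from typing import Callable, List

def discrepancy_rate(sentences: List[str],
                     descriptors: List[str],
                     classify: Callable[[str], int]) -> float:
    """Fraction of (sentence, descriptor) pairs where adding the descriptor
    changes the model's SDoH determination."""
    flips, total = 0, 0
    for sentence in sentences:
        baseline = classify(sentence)
        for descriptor in descriptors:
            perturbed = f"The patient is a {descriptor}. {sentence}"
            if classify(perturbed) != baseline:
                flips += 1
            total += 1
    return flips / total if total else 0.0

# Dummy stand-in classifier: flags sentences containing housing-related keywords.
def keyword_classifier(text: str) -> int:
    return int(any(word in text.lower() for word in ("homeless", "shelter", "evicted")))

sentences = ["Patient was recently evicted and is couch-surfing with friends."]
descriptors = ["Black woman", "white man", "Hispanic man", "Asian woman"]
print(discrepancy_rate(sentences, descriptors, keyword_classifier))  # 0.0 for this toy model
```

A lower discrepancy rate means the model’s output depends less on the injected demographic wording, which is how the study compared its fine-tuned models against GPT-4.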
Fine-tuning language models could be one strategy to lessen the risk of bias, but more research is needed, Bitterman said in a statement.