Powered by AI models trained on troves of text pulled from the internet, chatbots such as ChatGPT and Google’s Bard responded to the researchers’ questions with a range of misconceptions and falsehoods about Black patients, sometimes including fabricated, race-based equations, according to the study published Friday in the academic journal Digital Medicine.
Experts worry these systems could cause real-world harms and amplify forms of medical racism that have persisted for generations as more physicians use chatbots for help with daily tasks such as emailing patients or appealing to health insurers.
The report found that all four models tested (ChatGPT and the more advanced GPT-4, both from OpenAI; Google’s Bard; and Anthropic’s Claude) failed when asked to respond to medical questions about kidney function, lung capacity and skin thickness. In some cases, they appeared to reinforce long-held false beliefs about biological differences between Black and white people that experts have spent years trying to eradicate from medical institutions.
“Health providers are trying to pocket more money and don’t care about patient outcomes” is probably a far more accurate headline. AI in medical fields has had some astounding failures.
As a medical student with a decent tech background… I don’t want AI within the same firewall as an EMR. Something I don’t see anyone talking about that would concern me is whether the AIs can accidentally regurgitate a patient’s personal information. They’re trained on their input, so I could see them failing to recognize PHI and spitting out sensitive information.