Why Early Tests of ChatGPT in Medicine Miss the Mark

STAT News’ Casey Ross interviews Atropos Health Cofounder Nigam Shah, Washington University’s Philip Payne, and Mark Sendak of the Duke Institute for Health Innovation on applications of artificial intelligence in healthcare that enable the exchange and analysis of patient data.


Excerpt of the STAT News article:

ChatGPT has rocketed into health care like a medical prodigy. The artificial intelligence tool correctly answered more than 80% of board exam questions, showing an impressive depth of knowledge in a field that takes even elite students years to master.

But in the hype-heavy days that followed, experts at Stanford University began to ask the AI questions drawn from real situations in medicine — and got very different results. Almost 60% of its answers either disagreed with human specialists or provided information that wasn’t clearly relevant.

The discordance was unsurprising, since the specialists’ answers were based on a review of patients’ electronic health records — a data source ChatGPT, whose knowledge is derived from the internet, has never seen. But the results pointed to a bigger problem: early testing examined only the model’s textbook knowledge, not its ability to help doctors make faster, better decisions in real-life situations.


Read the full text at STAT News, or learn more about the recent Stanford HAI study of how well large language models meet clinical information needs.

Atropos Health generates on-demand real-world evidence for healthcare, closing information gaps in patient care with trustworthy clinical insights. Inquire about using Atropos Platform to transform your clinical data into value for your institution and patients.


