Forbes: Your Doctor Consulting ChatGPT Isn’t An Intelligent Choice (Yet): Study

Apr 17, 2023 | News and Media

Michael L. Millenson of Forbes | Originally published April 16, 2023

Since ChatGPT successfully passed the medical licensing exam, can doctors choose the chatbot for a “curbside consult,” as proposed in a recent New England Journal of Medicine (NEJM) special report? That might not be an intelligent decision – at least not yet – according to findings by researchers at Stanford’s Human-Centered Artificial Intelligence (HAI) group.


Article excerpted below. Read the full piece on Forbes.

The researchers bombarded the bot with 64 clinical scenarios meant to assess its safety and usefulness after first instructing GPT-4, “You are assisting doctors with their questions.”

The NEJM special report concluded that GPT-4 “generally provides useful responses,” without giving specifics. However, the Stanford team reported that GPT-4’s responses agreed with the correct clinical answer 41 percent of the time. In baseball, a .410 batting average makes you one of the best hitters ever. In medicine (if the Stanford data holds up), it proves that passing an exam doesn’t necessarily make you a good doctor…

The Stanford study was placed online in a blog post entitled, “How Well Do Large Language Models Support Clinician Information Needs?” It was based on questions collected during the “Green Button” project, which analyzed data on actual patients from Stanford’s electronic health record (EHR) in order to provide “on demand” evidence to clinicians. (Doctors don’t actually push a button; they type in a query.)

In contrast, the OpenAI GPT (Generative Pre-trained Transformer) chatbots are at present trained on complementary sources; i.e., the medical literature and information found online.

Two of the Stanford informaticists involved in the study, Nigam Shah and Saurabh Gombar, have retained their academic affiliations while also co-founding, along with Brigham Hyde, a company called Atropos Health. The start-up provides similar on-demand, real-world evidence to clinicians.

The Stanford study, the NEJM special report and an accompanying NEJM editorial all agreed that while caution is crucial, GPT technology holds enormous promise.

“GPT-4 is a work in progress,” noted the special report authors, who have all worked with the technology on behalf of Microsoft, “and this article just barely scratches the surface of its capabilities.”
