A manuscript titled "Answering real-world clinical questions using large language model, retrieval-augmented generation, and agentic systems" was published in Digital Health by authors affiliated with Atropos Health, Stanford University, The Hospital for Sick Children, Columbia University, University of Michigan, and the University of California, San Diego.

Short Summary:

The practice of evidence-based medicine can be challenging when relevant data are lacking or difficult to contextualize for a specific patient. Large language models (LLMs) could address both challenges, either by summarizing the published literature or by generating new studies from real-world data. The study submitted 50 clinical questions to five LLM-based systems: OpenEvidence, which uses an LLM for retrieval-augmented generation (RAG); ChatRWD, which uses an LLM as an interface to a data extraction and analysis pipeline; and three general-purpose LLMs (ChatGPT-4, Claude 3 Opus, and Gemini 1.5 Pro). Nine independent physicians evaluated the answers for relevance, quality of supporting evidence, and actionability (i.e., whether the answer was sufficient to justify or change clinical practice).

Key Conclusions: 

General-purpose LLMs rarely produced relevant, evidence-based answers. The RAG-based system (OpenEvidence) performed well when existing published data were available, while only the agentic ChatRWD could provide actionable answers when preexisting studies were lacking. Synergistic systems that combine RAG-based evidence summarization with agentic generation of novel evidence could improve the availability of pertinent evidence for patient care.

Read the full manuscript

To learn how Atropos Health can accelerate and supplement your research with Real-World Evidence (RWE), email us: sales@atroposhealth.com