A study published in Nature Medicine is drawing attention to the risks of large language models (LLMs) inadvertently spreading medical misinformation. Researchers found that even small amounts of false information in training datasets can lead to harmful outputs that are nearly impossible to distinguish from accurate ones during standard testing.
To address this issue, the research team proposed using biomedical knowledge graphs to verify and flag problematic outputs, emphasizing the need for transparency and oversight in LLM development—particularly for healthcare applications.
The Problem with LLM Training Data
LLMs, like GPT-4 and LLaMA, are trained on massive datasets sourced from the open Internet, where information quality varies widely. While automated filters can catch overtly offensive content, more subtle misinformation often slips through, especially when it appears credible. This makes these models vulnerable to "data poisoning," a tactic where bad actors intentionally introduce false information into training data.
In fields like healthcare, even minor inaccuracies can have serious consequences. Existing benchmarks for medical LLMs, such as MedQA and PubMedQA, primarily assess performance but fall short when it comes to detecting harmful outputs. Human evaluations, while more effective, are time-intensive and challenging to scale.
To explore these risks, researchers analyzed a popular training dataset called The Pile and simulated how misinformation could impact model performance. They also tested a novel solution: leveraging biomedical knowledge graphs to cross-check outputs for accuracy.
How Researchers Tested for Misinformation
The team focused on three medical domains: general medicine, neurosurgery, and medications. They developed a concept map of 20 key medical terms and their synonyms, then examined datasets like OpenWebText, RefinedWeb, C4, SlimPajama, and The Pile. Among these, The Pile stood out as relatively stable due to its inclusion of curated medical sources like PubMed Central.
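As an illustration only, the snippet below sketches how such a concept-map scan of a text corpus might look. The three terms, their synonyms, and the sample documents are invented placeholders, not the 20 study terms or the actual datasets.

```python
import re
from collections import Counter

# Hypothetical miniature concept map: each key term maps to a list of synonyms.
# The study used 20 terms per domain; these three are placeholders.
CONCEPT_MAP = {
    "myocardial infarction": ["heart attack", "mi"],
    "metformin": ["glucophage"],
    "glioblastoma": ["gbm", "glioblastoma multiforme"],
}

def build_patterns(concept_map):
    """Compile one case-insensitive regex per concept, covering its synonyms."""
    patterns = {}
    for term, synonyms in concept_map.items():
        alternatives = [re.escape(t) for t in [term] + synonyms]
        patterns[term] = re.compile(r"\b(" + "|".join(alternatives) + r")\b", re.IGNORECASE)
    return patterns

def count_concept_mentions(documents, concept_map):
    """Count how many documents mention each concept at least once."""
    patterns = build_patterns(concept_map)
    counts = Counter()
    for doc in documents:
        for term, pattern in patterns.items():
            if pattern.search(doc):
                counts[term] += 1
    return counts

if __name__ == "__main__":
    sample_docs = [
        "The patient presented with a heart attack and was started on metformin.",
        "Glioblastoma multiforme remains difficult to treat.",
    ]
    print(count_concept_mentions(sample_docs, CONCEPT_MAP))
```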
To simulate data poisoning, they created fake medical articles using OpenAI’s GPT-3.5-turbo API. These articles contained hidden misinformation and were incorporated into The Pile. The team then trained models with varying degrees of poisoned data and asked blinded reviewers, including physicians and medical students, to evaluate the outputs.
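The sketch below illustrates only the mixing step of such an experiment, under simplifying assumptions: the synthetic misinformation documents are treated as pre-generated strings (the study generated them with GPT-3.5-turbo), and a helper blends them into a clean corpus at a target poisoning fraction.

```python
import random

def inject_poison(clean_docs, poisoned_docs, poison_fraction, seed=0):
    """
    Mix synthetic misinformation into a clean corpus so that poisoned
    documents make up roughly `poison_fraction` of the result.
    """
    rng = random.Random(seed)
    n_clean = len(clean_docs)
    # Solve n_poison / (n_clean + n_poison) = poison_fraction for n_poison.
    n_poison = round(poison_fraction * n_clean / (1.0 - poison_fraction))
    n_poison = min(n_poison, len(poisoned_docs))
    mixed = clean_docs + rng.sample(poisoned_docs, n_poison)
    rng.shuffle(mixed)
    return mixed

# Example: at 0.001 % poisoning, a 1-million-document corpus needs only ~10 fake articles.
clean = [f"clean doc {i}" for i in range(1_000_000)]
fake = [f"fabricated medical article {i}" for i in range(100)]
corpus = inject_poison(clean, fake, poison_fraction=0.00001)
print(sum(doc.startswith("fabricated") for doc in corpus))  # -> 10
```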
As expected, models trained on poisoned datasets generated significantly more harmful content than those trained on clean data. Surprisingly, however, the poisoned models performed just as well as unpoisoned ones on general benchmarks like LAMBADA and MedQA, revealing that these tests often fail to detect misinformation risks.
To combat misinformation, the researchers developed a system that uses biomedical knowledge graphs to verify outputs. This system breaks down medical text into smaller components, called knowledge triplets, and cross-checks them against a trusted knowledge graph. Any unverified triplets are flagged as potential misinformation.
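The following is a minimal sketch of that verification idea, not the authors' implementation: the triplet extractor is a stand-in (a real pipeline would use named-entity recognition and relation extraction), and the "trusted graph" is a tiny in-memory set rather than a curated biomedical knowledge graph.

```python
from typing import List, Tuple

Triplet = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical trusted facts, standing in for a curated biomedical knowledge graph.
TRUSTED_GRAPH = {
    ("metformin", "treats", "type 2 diabetes"),
    ("aspirin", "may_cause", "gastrointestinal bleeding"),
}

def extract_triplets(text: str) -> List[Triplet]:
    """Placeholder extractor: a real system would use NER plus relation extraction."""
    if "metformin" in text.lower() and "cure" in text.lower():
        return [("metformin", "cures", "type 2 diabetes")]
    return [("metformin", "treats", "type 2 diabetes")]

def flag_unverified(text: str) -> List[Triplet]:
    """Return every extracted triplet that is not supported by the trusted graph."""
    return [t for t in extract_triplets(text) if t not in TRUSTED_GRAPH]

print(flag_unverified("Metformin cures type 2 diabetes permanently."))
# -> [('metformin', 'cures', 'type 2 diabetes')]  flagged as potential misinformation
print(flag_unverified("Metformin treats type 2 diabetes."))
# -> []  all triplets verified
```

Because the check reduces to set membership (or a graph query) rather than another LLM call, it stays cheap and interpretable, which is the property the researchers highlight.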
This approach demonstrated near-perfect accuracy, identifying over 90% of harmful content in the poisoned models. Unlike other verification methods that rely on LLMs themselves, this solution is efficient, interpretable, and doesn't require advanced hardware or excessive computational resources, making it a practical option for real-time use.
Findings and Analysis
The study revealed several critical insights into the risks and potential solutions surrounding medical misinformation in LLMs. First, web-scale datasets like Common Crawl and OpenWebText were identified as particularly vulnerable to data poisoning due to their lack of oversight. A significant portion of medical content in these datasets was at risk, with Common Crawl standing out as a major contributor to the problem. This highlights the need for stricter curation and monitoring of data sources used for training LLMs.
Even minimal contamination in the training data had a notable impact. The researchers found that poisoned content making up as little as 0.001% of the training data could lead models to generate harmful medical misinformation. This demonstrates the outsized influence that even small amounts of false information can have on the reliability of LLM outputs, making the stakes particularly high in domains like healthcare.
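To put that figure in perspective, the back-of-envelope calculation below uses an assumed round corpus size and article length (not the study's actual token counts) to show how little material 0.001% represents.

```python
# Illustrative scale only: the corpus size and article length are assumptions.
corpus_tokens = 100_000_000_000      # hypothetical 100-billion-token training corpus
poison_fraction = 0.00001            # 0.001 % expressed as a fraction
poisoned_tokens = corpus_tokens * poison_fraction
articles = poisoned_tokens / 1_000   # assuming ~1,000 tokens per fabricated article
print(f"{poisoned_tokens:,.0f} tokens ~ {articles:,.0f} short articles")
# -> 1,000,000 tokens ~ 1,000 short articles
```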
The study also underscored the limitations of existing benchmarks. Compromised models performed comparably to unpoisoned ones on general evaluations such as LAMBADA and MedQA, which are not designed to identify misinformation risks. This gap in detection tools highlights the pressing need for benchmarks that prioritize safety and accuracy, especially in high-risk applications.
Finally, biomedical knowledge graphs emerged as a promising solution for mitigating these risks. By breaking down medical content into smaller components and cross-checking them against trusted sources, these graphs were able to detect over 90% of harmful outputs. Their efficiency and scalability make them a practical tool for improving the reliability of LLMs, offering a pathway to safer applications in critical fields like medicine.
Conclusion
This study highlights the serious risks posed by data-poisoning attacks on LLMs in healthcare. Even subtle misinformation in training datasets can lead to harmful outputs that evade detection by current benchmarks. The findings demonstrate the urgent need for stronger oversight, better evaluation tools, and transparent data practices in LLM development.
The proposed use of biomedical knowledge graphs offers a practical solution to mitigate misinformation risks. By reliably detecting over 90% of harmful content, this approach enhances the safety and reliability of LLMs in medical applications.
For researchers, developers, and policymakers, this study underscores the importance of prioritizing accuracy and patient safety as LLMs become more integrated into sensitive fields like healthcare.
Journal Reference
Alber et al. (2025). Medical large language models are vulnerable to data-poisoning attacks. Nature Medicine. DOI: 10.1038/s41591-024-03445-1. https://www.nature.com/articles/s41591-024-03445-1