Study Warns of Risks from Medical Misinformation in Large Language Models

A study published in Nature Medicine is drawing attention to the risk that large language models (LLMs) spread medical misinformation. Researchers found that even small amounts of false information in training datasets could lead to harmful outputs that are nearly impossible to distinguish from accurate ones during standard testing.

Study: Medical large language models are vulnerable to data-poisoning attacks. Image Credit: BOY ANTHONY/Shutterstock.com

To address this issue, the research team proposed using biomedical knowledge graphs to verify and flag problematic outputs, and emphasized the need for transparency and oversight in LLM development, particularly for healthcare applications.

The Problem with LLM Training Data

LLMs, like GPT-4 and LLaMA, are trained on massive datasets sourced from the open Internet, where information quality varies widely. While automated filters can catch overtly offensive content, more subtle misinformation often slips through, especially when it appears credible. This makes these models vulnerable to "data poisoning," a tactic where bad actors intentionally introduce false information into training data.

In fields like healthcare, even minor inaccuracies can have serious consequences. Existing benchmarks for medical LLMs, such as MedQA and PubMedQA, primarily assess performance but fall short when it comes to detecting harmful outputs. Human evaluations, while more effective, are time-intensive and challenging to scale.

To explore these risks, researchers analyzed a popular training dataset called The Pile and simulated how misinformation could impact model performance. They also tested a novel solution: leveraging biomedical knowledge graphs to cross-check outputs for accuracy.

How Researchers Tested for Misinformation

The team focused on three medical domains: general medicine, neurosurgery, and medications. They developed a concept map of 20 key medical terms and their synonyms, then examined datasets like OpenWebText, RefinedWeb, C4, SlimPajama, and The Pile. Among these, The Pile stood out as relatively stable due to its inclusion of curated medical sources like PubMed Central.

To simulate data poisoning, they created fake medical articles using OpenAI’s GPT-3.5-turbo API. These articles contained hidden misinformation and were incorporated into The Pile. The team then trained models with varying degrees of poisoned data and asked blinded reviewers, including physicians and medical students, to evaluate the outputs.
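The mixing step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `mix_poisoned`, the document-level (rather than token-level) accounting, and the toy corpus sizes are all assumptions made for illustration.

```python
import random


def mix_poisoned(clean_docs, poisoned_docs, poison_fraction, seed=0):
    """Return a shuffled list of (text, is_poisoned) pairs.

    Hypothetical helper: chooses n poisoned documents so that
    n / (len(clean_docs) + n) is approximately `poison_fraction`.
    """
    if not 0 <= poison_fraction < 1:
        raise ValueError("poison_fraction must be in [0, 1)")

    # Solve n / (clean + n) = f for n, then round to a whole document count.
    n_poison = round(poison_fraction * len(clean_docs) / (1 - poison_fraction))
    if n_poison > len(poisoned_docs):
        raise ValueError("not enough poisoned documents for this fraction")

    rng = random.Random(seed)  # fixed seed keeps the mix reproducible
    corpus = [(doc, False) for doc in clean_docs]
    corpus += [(doc, True) for doc in rng.sample(poisoned_docs, n_poison)]
    rng.shuffle(corpus)
    return corpus
```

Training one model per value of `poison_fraction`, plus a clean baseline, would give the kind of comparison the reviewers were asked to evaluate.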

As expected, models trained on poisoned datasets generated significantly more harmful content compared to those trained on clean data. Surprisingly, poisoned models performed just as well as unpoisoned ones on general benchmarks like LAMBADA and MedQA, revealing that these tests often fail to detect misinformation risks.

To combat misinformation, the researchers developed a system that uses biomedical knowledge graphs to verify outputs. This system breaks down medical text into smaller components, called knowledge triplets, and cross-checks them against a trusted knowledge graph. Any unverified triplets are flagged as potential misinformation.
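The cross-checking idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's implementation: the `Triplet` type, the `flag_unverified` function, exact-match lookup after lowercasing, and the two-entry toy graph are assumptions, and the step that extracts triplets from free text is assumed to have already run.

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    subject: str
    relation: str
    obj: str


def normalize(triplet):
    """Lowercase and trim each part so trivial spelling variants still match."""
    return Triplet(*(part.strip().lower() for part in triplet))


def flag_unverified(triplets, knowledge_graph):
    """Return the triplets that have no match in the trusted graph."""
    trusted = {normalize(t) for t in knowledge_graph}
    return [t for t in triplets if normalize(t) not in trusted]


# Toy trusted graph (hypothetical; a real biomedical graph is far larger).
TOY_GRAPH = [
    Triplet("metformin", "treats", "type 2 diabetes"),
    Triplet("aspirin", "may_cause", "gastrointestinal bleeding"),
]

# Triplets assumed to have been extracted from a model's output.
extracted = [
    Triplet("Metformin", "treats", "Type 2 diabetes"),
    Triplet("metformin", "treats", "influenza"),
]

flagged = flag_unverified(extracted, TOY_GRAPH)
```

Here only the second extracted triplet is flagged, because it has no counterpart in the toy graph. A set lookup like this needs no GPU and every flag traces back to a specific missing fact, which is consistent with the efficiency and interpretability the article attributes to the approach.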

This approach identified over 90% of harmful content in the poisoned models' outputs. Unlike other verification methods that rely on LLMs themselves, this solution is efficient, interpretable, and doesn’t require advanced hardware or excessive computational resources, making it a practical option for real-time use.

Findings and Analysis

The study revealed several critical insights into the risks and potential solutions surrounding medical misinformation in LLMs. First, web-scale datasets like Common Crawl and OpenWebText were identified as particularly vulnerable to data poisoning due to their lack of oversight. A significant portion of medical content in these datasets was at risk, with Common Crawl standing out as a major contributor to the problem. This highlights the need for stricter curation and monitoring of data sources used for training LLMs.

Even minimal contamination in the training data had a notable impact. The researchers found that as little as 0.001% of poisoned data could lead to models generating harmful medical misinformation. This demonstrates the outsized influence that even small amounts of false information can have on the reliability of LLM outputs, making the stakes particularly high in domains like healthcare.
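To give a sense of scale, the short calculation below converts that percentage into an absolute count. The corpus size of 100 billion tokens is a hypothetical round number chosen for illustration, not a figure from the study, and `poisoned_token_count` is an illustrative helper.

```python
def poisoned_token_count(total_tokens, poison_percent):
    """Number of tokens corresponding to `poison_percent` percent of a corpus."""
    return round(total_tokens * poison_percent / 100)


# Hypothetical corpus size; the percentage is the one quoted in the article.
HYPOTHETICAL_CORPUS_TOKENS = 100_000_000_000
count = poisoned_token_count(HYPOTHETICAL_CORPUS_TOKENS, 0.001)
```

Under that assumption, 0.001% corresponds to one million tokens, or one token in every 100,000.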

The study also underscored the limitations of existing benchmarks. Compromised models performed comparably to unpoisoned ones on general evaluations such as LAMBADA and MedQA, which are not designed to identify misinformation risks. This gap in detection tools highlights the pressing need for benchmarks that prioritize safety and accuracy, especially in high-risk applications.

Finally, biomedical knowledge graphs emerged as a promising solution for mitigating these risks. By breaking down medical content into smaller components and cross-checking them against trusted sources, these graphs were able to detect over 90% of harmful outputs. Their efficiency and scalability make them a practical tool for improving the reliability of LLMs, offering a pathway to safer applications in critical fields like medicine.

Conclusion

This study highlights the serious risks posed by data-poisoning attacks on LLMs in healthcare. Even subtle misinformation in training datasets can lead to harmful outputs that evade detection by current benchmarks. The findings demonstrate the urgent need for stronger oversight, better evaluation tools, and transparent data practices in LLM development.

The proposed use of biomedical knowledge graphs offers a practical solution to mitigate misinformation risks. By reliably detecting over 90% of harmful content, this approach enhances the safety and reliability of LLMs in medical applications.

For researchers, developers, and policymakers, this study underscores the importance of prioritizing accuracy and patient safety as LLMs become more integrated into sensitive fields like healthcare.

Journal Reference

Alber et al., 2025. Medical large language models are vulnerable to data-poisoning attacks. Nature Medicine. DOI:10.1038/s41591-024-03445-1 https://www.nature.com/articles/s41591-024-03445-1

Article Revisions

  • Jan 15 2025 - Image replaced from one with an LMS concept to one with an LLM concept.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Nandi, Soham. (2025, January 15). Study Warns of Risks from Medical Misinformation in Large Language Models. AZoRobotics. Retrieved on January 21, 2025 from https://www.azorobotics.com/News.aspx?newsID=15632.

  • MLA

    Nandi, Soham. "Study Warns of Risks from Medical Misinformation in Large Language Models". AZoRobotics. 21 January 2025. <https://www.azorobotics.com/News.aspx?newsID=15632>.

  • Chicago

    Nandi, Soham. "Study Warns of Risks from Medical Misinformation in Large Language Models". AZoRobotics. https://www.azorobotics.com/News.aspx?newsID=15632. (accessed January 21, 2025).

  • Harvard

    Nandi, Soham. 2025. Study Warns of Risks from Medical Misinformation in Large Language Models. AZoRobotics, viewed 21 January 2025, https://www.azorobotics.com/News.aspx?newsID=15632.
