Recent research published in Radiology by researchers in Denmark found that a commercial AI tool, used off-label, was effective at ruling out pathology on chest X-rays, with critical miss rates equivalent to or lower than those of radiologists.
Recent advances in AI, together with the growing workload facing radiology departments, the worldwide shortage of radiologists, and the risk of burnout in the profession, have driven interest in computer-assisted diagnosis. Because a large share of the chest X-rays performed in radiology departments are unremarkable, AI might be able to streamline workflow by generating automated reports for these studies.
Danish researchers sought to estimate the percentage of unremarkable chest X-rays for which AI could accurately rule out pathology without increasing diagnostic errors. The study drew on chest X-rays and radiology reports from four Danish hospitals, covering 1,961 patients (median age, 72 years; 993 female), with one chest X-ray per patient.
Our group and others have previously shown that AI tools are capable of excluding pathology in chest X-rays with high confidence and can thereby provide an autonomous normal report without a human in the loop. Such AI algorithms miss very few abnormal chest radiographs. However, before our current study, we didn’t know what the appropriate threshold was for these models.
Louis Lind Plesner, Study Lead Author, Department of Radiology, Herlev and Gentofte Hospital
The study team sought to determine whether errors generated by radiologists and artificial intelligence (AI) differed in quality and whether, on the whole, AI errors were objectively worse than human errors.
Additional Work Needed Before Widespread Deployment
The AI tool was modified to output a chest X-ray “remarkableness” probability, which was then used to calculate specificity at various AI sensitivity levels.
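The paper does not publish its analysis code, but the idea can be sketched. The snippet below is an illustrative assumption of how a “remarkableness” probability could be turned into an operating threshold that keeps sensitivity at or above a target (for example, 98%), with the resulting specificity read off as the fraction of unremarkable X-rays correctly ruled out. Variable names, the synthetic data, and the scikit-learn workflow are mine, not the authors’.

```python
# Illustrative sketch only -- not the study's published code.
# y_true: 1 = remarkable, 0 = unremarkable (per the radiologist reference standard)
# y_prob: the AI tool's "remarkableness" probability for each chest X-ray
import numpy as np
from sklearn.metrics import roc_curve

def specificity_at_sensitivity(y_true, y_prob, target_sensitivity=0.98):
    """Return (threshold, specificity) at the operating point with the lowest
    false-positive rate whose sensitivity is at least `target_sensitivity`."""
    fpr, tpr, thresholds = roc_curve(y_true, y_prob)
    # roc_curve lists operating points in order of increasing sensitivity,
    # so the first point meeting the target has the lowest false-positive rate.
    idx = np.argmax(tpr >= target_sensitivity)
    return thresholds[idx], 1.0 - fpr[idx]

# Quick demonstration with synthetic probabilities (values are made up):
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=2000)
y_prob = np.clip(y_true * 0.6 + rng.normal(0.3, 0.25, size=2000), 0, 1)
thr, spec = specificity_at_sensitivity(y_true, y_prob, 0.98)
print(f"threshold={thr:.3f}, specificity={spec:.1%} of unremarkable X-rays ruled out")
```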
Using a predefined set of unremarkable findings, two chest radiologists, blinded to the AI output, classified the chest X-rays as “remarkable” or “unremarkable.” One chest radiologist, blinded to whether the AI or the human reader had made the error, graded chest X-rays with findings missed by the AI and/or the radiology report as critical, clinically significant, or clinically minor.
The reference standard classified 1,231 of 1,961 chest X-rays (62.8%) as remarkable and 730 of 1,961 (37.2%) as unremarkable. At sensitivities of 98% or higher, the AI tool correctly excluded pathology in 24.5% to 52.7% of unremarkable chest X-rays, with lower rates of critical misses than those found in the radiology reports associated with the images.
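Putting those figures together gives a rough sense of the workload effect (a back-of-the-envelope reading of the reported numbers, not a calculation from the paper): about 37.2% of all chest X-rays were unremarkable, and the AI could rule out pathology in roughly a quarter to a half of those, i.e., on the order of 9% to 20% of all chest X-rays in the study.

```python
# Back-of-the-envelope arithmetic from the reported figures (not the paper's code).
unremarkable_fraction = 730 / 1961               # ~0.372 of all chest X-rays
ruled_out_low, ruled_out_high = 0.245, 0.527     # AI specificity range at >= 98% sensitivity
print(f"{unremarkable_fraction * ruled_out_low:.1%}")   # ~9.1% of all X-rays
print(f"{unremarkable_fraction * ruled_out_high:.1%}")  # ~19.6% of all X-rays
```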
According to Dr. Plesner, errors caused by AI were typically more clinically serious for the patient than errors made by radiologists.
This is likely because radiologists interpret findings based on the clinical scenario, which AI does not. Therefore, when AI is intended to provide an automated normal report, it has to be more sensitive than the radiologist to avoid decreasing the standard of care during implementation. This finding is also of general interest in an era when AI capabilities are reaching multiple high-stakes environments, not limited to health care.
Louis Lind Plesner, Study Lead Author, Department of Radiology, Herlev and Gentofte Hospital
According to Dr. Plesner, AI could independently report over half of all normal chest X-rays.
“In our hospital-based study population, this meant that more than 20% of all chest X-rays could have been potentially autonomously reported using this methodology while keeping a lower rate of clinically relevant errors than the current standard.”
Dr. Plesner noted that before broad deployment can be recommended, the model must be implemented prospectively using one of the thresholds suggested in the study.
Journal Reference:
Plesner, L. L., et al. (2024). Using AI to Identify Unremarkable Chest Radiographs for Automatic Reporting. Radiology. https://doi.org/10.1148/radiol.240272