Reviewed by Lexie Corner, Mar 1 2024
Clinicians and data scientists have collaborated to create innovative algorithms that have the potential to customize treatment plans.
UC Davis Health clinicians and data scientists have developed a machine-learning model that more accurately predicts which patients are likely to develop hepatocellular carcinoma (HCC), a prevalent form of liver cancer.
The study, published in Gastro Hep Advances, shows how machine-learning prediction could help healthcare professionals assess risk early in patients diagnosed with metabolic dysfunction-associated steatotic liver disease (MASLD).
This pioneering technology can equip physicians with vital information, enabling them to conduct thorough screenings and provide tailored patient care.
MASLD can lead to HCC, but the disease is quite sneaky, and it’s often unclear which patients face that risk. It doesn’t make sense to biopsy every patient with MASLD, but if we can segment for risk, we can track those people more closely and perhaps catch HCC early.
Aniket Alurwar, Clinical Informatics Specialist, Center for Precision Medicine and Data Sciences, UC Davis
Diagnosing a Stealthy Condition
MASLD, previously known as nonalcoholic fatty liver disease or NAFLD, is characterized by the buildup of fat in the liver and is frequently associated with metabolic disorders such as type 2 diabetes. Approximately a quarter of the American population is affected by some type of MASLD, making it a prevalent liver condition.
The clinicians collaborated closely with the data science team, comprising Souvik Sarkar, an Assistant Professor in Gastroenterology and Hepatology, as the first author, and Frederick Meyers, a Distinguished Professor of Internal Medicine, Hematology and Oncology, as the senior author. Meyers also serves as the Director of the Center for Precision Medicine and Data Sciences.
This research represents a pioneering effort in its field. The scientists trained machine-learning algorithms on extensive datasets to produce predictions that could then be verified.
Nine open-source algorithms were tested, and five were selected for further evaluation and model building. These five algorithms were trained using de-identified health data from 1,561 UC Davis Health MASLD patients, 227 of whom developed HCC. Subsequently, the top five algorithms were validated using data from 686 UC San Francisco patients, with 176 receiving an HCC diagnosis. The prediction model with the highest statistical accuracy, sensitivity, and specificity was developed using an algorithm known as Gradient Boosted Trees.
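To make the three reported measures concrete, the following is a minimal Python sketch of how accuracy, sensitivity, and specificity are computed from a model's predictions. It is not the study's code: the function name `classification_metrics` and the toy labels are hypothetical, and only the cohort counts (227 of 1,561 and 176 of 686) come from the article.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = HCC, 0 = no HCC)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # HCC correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-HCC correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed HCC cases
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of true HCC cases detected
        "specificity": tn / (tn + fp),  # share of non-HCC cases correctly cleared
    }


# Toy labels for illustration only; these are NOT study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
m = classification_metrics(y_true, y_pred)
print(m)

# HCC event rates in the two cohorts, using the counts given in the article.
derivation_rate = 227 / 1561  # UC Davis Health training cohort
validation_rate = 176 / 686   # UC San Francisco validation cohort
print(f"derivation: {derivation_rate:.1%}, validation: {validation_rate:.1%}")
```

Because the share of patients with HCC differs between the two cohorts, accuracy alone can be misleading, which is one reason sensitivity and specificity are usually reported alongside it.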
The research has verified that advanced liver fibrosis or scarring, indicated by high Fibrosis-4 Index (FIB-4) scores, is one of the most dependable indicators for the risk of HCC. The scientists have also identified four other risk factors related to liver function: high cholesterol, hypertension, bilirubin, and alkaline phosphatase (ALP), an enzyme that can serve as an indication of liver issues. Combining these risk factors into a single model made it possible to predict the risk of HCC.
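For readers unfamiliar with FIB-4, the index is calculated from a patient's age and three routine blood values using the standard formula: age multiplied by AST, divided by the product of the platelet count and the square root of ALT. The Python sketch below illustrates that formula only; the function name `fib4_index` and the example values are hypothetical, and the article does not describe how the study computed or thresholded the score.

```python
import math


def fib4_index(age_years, ast_u_per_l, alt_u_per_l, platelets_1e9_per_l):
    """Standard FIB-4 formula: (age * AST) / (platelets * sqrt(ALT)).

    Units: age in years, AST and ALT in U/L, platelets in 10^9 per litre.
    """
    if alt_u_per_l <= 0 or platelets_1e9_per_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_per_l) / (platelets_1e9_per_l * math.sqrt(alt_u_per_l))


# Hypothetical patient values for illustration only; not taken from the study.
score = fib4_index(age_years=60, ast_u_per_l=80, alt_u_per_l=64, platelets_1e9_per_l=150)
print(f"FIB-4 = {score:.2f}")
```

A higher score points to more advanced fibrosis, which the study found to be one of the most dependable indicators of HCC risk.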
AI Shows High Accuracy
The team discovered various routes to HCC, with elevated FIB-4 levels being the most apparent. Interestingly, some patients with low FIB-4 levels but high cholesterol, bilirubin, and hypertension also ended up developing HCC. According to existing protocols, these individuals would not be eligible for preventive measures.
We got 92.12% accuracy when predicting which MASLD patients would develop HCC, which is very good for a pilot model. Patients with low FIB-4 are typically considered low-risk and do not get referred for further assessment. By showing which of these ‘low-risk’ patients could develop HCC, we can get them referred for liver biopsies or imaging.
Aniket Alurwar, Clinical Informatics Specialist, Center for Precision Medicine and Data Sciences, UC Davis
The team takes pride in its model, but the researchers aim to improve its accuracy by integrating more detailed data, such as clinical notes. This will involve natural language processing, a form of AI that converts written text into structured data. The team also plans to experiment with Bedrock, Amazon’s generative AI platform. In the future, a comparable model may be integrated into electronic health records or a different platform to alert clinicians to MASLD patients at higher risk of HCC.
We believe we can improve the algorithm by incorporating the clinical notes and perhaps other information. Embedding this data should create an even more powerful model that we can then test to see how it performs.
Aniket Alurwar, Clinical Informatics Specialist, Center for Precision Medicine and Data Sciences, UC Davis
Journal Reference:
Sarkar, S., et al. (2024) A Machine Learning Model to Predict Risk for Hepatocellular Carcinoma in Patients with Metabolic Dysfunction-Associated Steatotic Liver Disease. Gastro Hep Advances. doi.org/10.1016/j.gastha.2024.01.007.