The COVID-19 pandemic has underscored both the need for, and the challenge of, using clinical data to inform national and state public health policymaking.
In a recent study, researchers from the Regenstrief Institute and Indiana University show that machine learning models trained on clinical data from a statewide health information exchange can predict, at the level of the individual patient, the probability that a person with the virus will be hospitalized.
“It has been quite challenging to bring the bread-and-butter data generated by healthcare systems together with public health decision-making, entities which have long been separate and distinct.”
Shaun Grannis, M.D., M.S., Study Senior Author and Vice President for Data and Analytics, Regenstrief Institute
Grannis is also a professor of family medicine at the Indiana University School of Medicine.
“Our work shows how you can build and employ AI (artificial intelligence) models to securely utilize the clinical information in a health information exchange to support public health needs such as predicting hospital utilization within one week and within six weeks of onset of COVID infection.”
Shaun Grannis, M.D., M.S., Study Senior Author and Vice President for Data and Analytics, Regenstrief Institute
“When new circumstances requiring rapid response arise, such as emergence of omicron or other new variants, once there are sufficient cases to train models, one can confidently access and plug clinical data into these readily available models to make accurate public health predictions and provide valuable insights into patient-level need for healthcare resource utilization,” Dr. Grannis added.
The scientists used clinical data from 96,026 individuals spanning all 957 ZIP codes in Indiana to train decision models that predicted healthcare resource utilization.
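As a rough illustration of what training one classifier per prediction window can look like, the sketch below fits a one-level decision tree (a decision stump) for each of two windows. It is not the study's method: the cohort is synthetic, age is the only feature, and every name in it (`fit_stump`, `simulate`, the risk curve) is a hypothetical chosen for the example.

```python
import random

def fit_stump(values, labels):
    """Pick the threshold t that minimizes training errors for the
    rule 'predict 1 when value >= t' (a one-level decision tree)."""
    # The extra candidate above the maximum allows an "always 0" rule.
    candidates = sorted(set(values)) + [max(values) + 1]

    def errors(t):
        return sum(int(v >= t) != y for v, y in zip(values, labels))

    return min(candidates, key=errors)

def accuracy(threshold, values, labels):
    """Share of training cases the threshold rule classifies correctly."""
    hits = sum(int(v >= threshold) == y for v, y in zip(values, labels))
    return hits / len(labels)

# Synthetic cohort: ages and outcomes are invented for illustration only.
rng = random.Random(0)
ages = [rng.randint(18, 90) for _ in range(500)]

def simulate(age, base_risk):
    # Hypothetical risk curve that rises with age above 50.
    risk = base_risk + 0.6 * max(0, age - 50) / 40
    return int(rng.random() < risk)

outcomes = {
    "within_1_week": [simulate(a, 0.05) for a in ages],
    "within_6_weeks": [simulate(a, 0.10) for a in ages],
}

# One model per prediction window.
stumps = {window: fit_stump(ages, y) for window, y in outcomes.items()}
for window, t in stumps.items():
    acc = accuracy(t, ages, outcomes[window])
    print(window, "threshold:", t, "training accuracy:", round(acc, 3))
```

A real model would draw on many more features and would be scored on held-out patients rather than on the training data.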
“Since the onset of COVID-19, researchers, healthcare systems, public health departments and others have leveraged existing data repositories and health information infrastructure for rapid analytics. Machine learning has been invaluable in these efforts.”
Suranga Kasturi, PhD, Study First Author and Research Scientist, Regenstrief Institute
Kasturi is also an assistant professor of pediatrics at Indiana University School of Medicine.
“But any model is only as good as the data that goes into it. The broad, robust data from the Indiana Network for Patient Care is representative of the U.S. population. What we have done could be characterized as a precursor of how AI tools can be deployed across the entire country with the important caveat that whatever models are used should be evaluated for fairness across all subpopulations.”
Suranga Kasturi, PhD, Study First Author and Research Scientist, Regenstrief Institute
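One simple way to check a model across subpopulations is to compute the same metric separately for each group and look at the spread. The article does not say which fairness measure the authors used; the sketch below assumes sensitivity (recall) per subgroup, and the labels, predictions, and group names are made-up placeholders.

```python
from collections import defaultdict

def recall_by_group(y_true, y_pred, groups):
    """Sensitivity (recall) per subgroup; None when a group has no positives."""
    hits = defaultdict(int)
    positives = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        positives[group] += 0  # make sure every group appears in the result
        if truth == 1:
            positives[group] += 1
            hits[group] += int(pred == 1)
    return {g: (hits[g] / n if n else None) for g, n in positives.items()}

# Hypothetical labels (1 = hospitalized), predictions, and subgroup tags.
y_true = [1, 1, 0, 1, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

recalls = recall_by_group(y_true, y_pred, groups)
measured = [r for r in recalls.values() if r is not None]
gap = max(measured) - min(measured)
print(recalls, "largest gap:", round(gap, 3))
```

A large gap would indicate that the model misses hospitalizations more often in one group than in another, which is the kind of disparity a fairness review is meant to surface.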
The Indiana Network for Patient Care (INPC), a regional health information exchange set up by the Regenstrief Institute and run by the Indiana Health Information Exchange (IHIE), is the country’s largest inter-organizational clinical data repository and contains over 14 billion pieces of patient data.
The study was published in the Journal of Medical Internet Research.
In addition to Drs. Grannis and Kasturi, the authors are Regenstrief Institute research scientists and IU School of Medicine faculty members Babar Khan, M.D., M.S., and David A. Haggstrom, M.D., MAS, as well as Jeremy Park, B.S., and David Wild, Ph.D., both of the Luddy School of Informatics, Computing and Engineering at IU-Bloomington.
The research was backed by a Regenstrief Institute COVID-19 research pilot grant and by Indiana University.
Journal Reference:
Kasturi, S., et al. (2021) Predicting COVID-19–Related Health Care Resource Utilization Across a Statewide Patient Population: Model Development Study. Journal of Medical Internet Research. doi.org/10.2196/31337.