Reinforcement learning, an artificial intelligence technique, has the potential to help physicians design sequential treatment plans that improve patient outcomes, but it needs major advances before it can be used in clinical settings, according to a recent study by researchers from Weill Cornell Medicine and Rockefeller University. The study was published in the Proceedings of the Conference on Neural Information Processing Systems (NeurIPS) and presented on December 13th, 2024.
Reinforcement learning (RL) is a type of machine learning algorithm capable of making a series of decisions over time. RL, which is responsible for recent AI achievements such as superhuman performance at chess and Go, can use changing medical conditions, test results, and previous treatment responses to recommend the next best step in customized patient care. This approach is especially promising for decisions about managing chronic and psychiatric diseases.
The study introduces “Episodes of Care” (EpiCare), the first RL benchmark for healthcare.
Benchmarks have driven improvement across machine learning applications including computer vision, natural language processing, speech recognition and self-driving cars. We hope they will now push RL progress in healthcare.
Dr. Logan Grosenick, Assistant Professor, Weill Cornell Medicine
RL agents adjust their actions based on feedback, gradually learning a policy that improves their decision-making.
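That feedback loop can be made concrete with a toy sketch. The environment, rewards, and treatment labels below are invented for illustration (this is not the EpiCare benchmark): a tabular Q-learning agent learns, by trial and error, which of two hypothetical treatments works best in each of two simplified disease states.

```python
import random

random.seed(0)

N_STATES, N_ACTIONS = 2, 2
# Q[s][a]: the agent's current estimate of the long-run value of
# choosing treatment a in disease state s.
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def simulate_response(state, action):
    """Stand-in for patient feedback: returns (reward, next_state)."""
    # By construction, treatment 0 works better in state 0; treatment 1 in state 1.
    reward = 1.0 if action == state else -0.5
    next_state = random.randrange(N_STATES)
    return reward, next_state

alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate
state = 0
for _ in range(5000):
    # Epsilon-greedy: mostly exploit the best-known action, occasionally explore.
    if random.random() < epsilon:
        action = random.randrange(N_ACTIONS)
    else:
        action = max(range(N_ACTIONS), key=lambda a: Q[state][a])
    reward, next_state = simulate_response(state, action)
    # Q-learning update: nudge the estimate toward reward + discounted future value.
    Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
    state = next_state

# The learned policy: the best-valued treatment in each state.
policy = [max(range(N_ACTIONS), key=lambda a: Q[s][a]) for s in range(N_STATES)]
```

After enough simulated episodes the policy recovers the environment's structure (treatment 0 in state 0, treatment 1 in state 1), which illustrates the study's point: the agent needed thousands of interactions to get there.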
“However, our findings show that while current methods are promising, they are exceedingly data hungry,” Dr. Grosenick added.
The researchers first evaluated the performance of five modern online RL models using EpiCare. All five outperformed the standard-of-care baseline, but only after training on thousands or tens of thousands of realistic simulated treatment events.
In the real world, RL techniques would never be trained directly on patients, so the authors next assessed five standard “off-policy evaluation” (OPE) methods: popular approaches that try to use previously collected data (such as from clinical trials) to sidestep the need for online data gathering. Using EpiCare, they found that modern OPE approaches consistently failed to accurately predict RL performance on health care data.
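A minimal sketch of one classic OPE idea, ordinary importance sampling, shows the underlying mechanics: estimate how a new policy would perform using only episodes logged under an old behavior policy, by reweighting each logged reward. The policies and rewards below are invented for illustration; the OPE methods evaluated in the study (and their failure modes on health care data) are far more involved.

```python
import random

random.seed(0)

# Hypothetical single-step setting with two actions.
behavior = [0.5, 0.5]   # P(action) under the logging policy (e.g., historical care)
target = [0.9, 0.1]     # P(action) under the new policy we want to evaluate

def logged_episode():
    """One logged 'episode': (action taken under the behavior policy, observed reward)."""
    action = 0 if random.random() < behavior[0] else 1
    reward = 1.0 if action == 0 else 0.0   # action 0 happens to work better
    return action, reward

episodes = [logged_episode() for _ in range(10000)]

# Importance-weighted estimate of the target policy's expected reward:
# reweight each logged reward by target_prob / behavior_prob.
estimate = sum(target[a] / behavior[a] * r for a, r in episodes) / len(episodes)

# Ground truth, known only because we built the environment ourselves:
true_value = target[0] * 1.0 + target[1] * 0.0   # 0.9
```

In this easy toy case the estimate lands close to the true value; the study's finding is that on realistic longitudinal health care data, current OPE methods do not reliably achieve that.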
Our findings indicate that current state-of-the-art OPE methods cannot be trusted to accurately predict reinforcement learning performance in longitudinal health care scenarios.
Dr. Mason Hargrave, Research Fellow, The Rockefeller University
As OPE methods are increasingly explored for health care applications, this study emphasizes the need for more accurate benchmarking tools, such as EpiCare, to audit current RL approaches and to provide metrics for measuring progress.
“We hope this work will facilitate more reliable assessment of reinforcement learning in health care settings and help accelerate the development of better RL algorithms and training protocols appropriate for medical applications,” stated Dr. Grosenick.
Adapting Convolutional Neural Networks to Interpret Graph Data
Convolutional neural networks (CNNs), which are commonly used to interpret images, can be modified to function for more generic graph-structured data, such as brain, gene, or protein networks, according to research presented by Dr. Grosenick in a second NeurIPS study that same day.
The foundation for “deep learning” using CNNs and the current era of neural-network-driven AI applications was established by the widespread success of CNNs for image recognition tasks in the early 2010s. Applications for CNNs are numerous and include medical image analysis, self-driving cars, and facial recognition.
Dr. Grosenick further added, “We are often interested in analyzing neuroimaging data which are more like graphs, with vertices and edges, than like images. But we realized that there wasn't anything available that was truly equivalent to CNNs and deep CNNs for graph-structured data.”
Brain networks are often depicted as graphs, in which brain regions (represented as vertices) propagate information to other brain regions (vertices) via “edges” that connect them and encode the strength of those connections. The same applies to gene and protein networks, human and animal behavioral data, and the geometry of chemical compounds such as drugs. By examining such graphs directly, researchers can better model dependencies and patterns between both local and distant connections.
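The core idea of generalizing convolution from images to graphs can be sketched with a single mean-aggregation message-passing step. This toy example is not the QuantNets method itself, just the standard building block: each vertex updates its feature by averaging over itself and its neighbors, so information propagates along edges much as a CNN filter pools over neighboring pixels.

```python
import numpy as np

# A tiny 4-vertex graph: adjacency matrix A (1 = edge between vertices).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

X = np.array([[1.0], [2.0], [3.0], [4.0]])  # one feature per vertex

# Add self-loops so each vertex keeps its own signal, then row-normalize
# so each row of A_hat averages over a neighborhood rather than summing.
A_hat = A + np.eye(4)
A_hat = A_hat / A_hat.sum(axis=1, keepdims=True)

W = np.array([[1.0]])           # learnable weight matrix (identity here for clarity)
X_next = A_hat @ X @ W          # one round of neighborhood averaging

# Vertex 0 is connected to vertices 1 and 2, so its updated feature is the
# mean of features 1.0, 2.0, and 3.0 (itself plus its two neighbors) = 2.0.
```

Stacking several such layers, each with its own learned weights and a nonlinearity in between, gives a deep graph network in the same way stacked convolutions give a deep CNN.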
Isaac Osafo Nkansah, a research associate in the Grosenick lab at the time of the study and the study’s first author, contributed to the development of the Quantized Graph Convolutional Networks (QuantNets) framework, which generalizes CNNs to graphs.
“We are now using it for modeling EEG (electrical brain activity) data in patients. We can have a net of 256 sensors over the scalp taking readings of neuronal activity—that is a graph. We are taking those large graphs and reducing them down to more interpretable components to better understand how dynamic brain connectivity changes as patients undergo treatment for depression or obsessive-compulsive disorder,” Dr. Grosenick added.
The researchers believe QuantNets will have a wide range of applications. For example, they want to model graph-structured posture data to track mouse behavior and human facial expressions extracted via computer vision.
Dr. Grosenick concluded, “While we are still navigating the safety and complexity of applying cutting-edge AI methods to patient care, every step forward—whether it is a new benchmarking framework or a more accurate model—brings us incrementally closer to personalized treatment strategies that have the potential to profoundly improve patient health outcomes.”