Machine-learning models can make mistakes and be difficult to use, so scientists at the Massachusetts Institute of Technology developed explanation methods to help users understand when and how much to trust a model’s predictions.
These explanations, however, are frequently complex, sometimes incorporating information about hundreds of model features. They are also often displayed as intricate visualizations, making them hard for people with limited machine-learning knowledge to fully comprehend.
To help people understand AI explanations, MIT researchers used large language models (LLMs) to translate plot-based explanations into plain language.
They created a two-part system that converts a machine-learning explanation into a paragraph of human-readable text and then automatically assesses the quality of the narrative, so the end user can decide whether or not to believe it.
Researchers can tailor the system's narrative descriptions to match user preferences or application requirements by prompting it with a few example explanations.
In the long run, the researchers intend to improve this technique by allowing users to ask a model follow-up questions about how it made predictions in real-world circumstances.
Our goal with this research was to take the first step toward allowing users to have full-blown conversations with machine-learning models about the reasons they made certain predictions, so they can make better decisions about whether to listen to the model.
Alexandra Zytek, Study Lead Author and Graduate Student, Massachusetts Institute of Technology
Laure Berti-Équille, a research director at the French National Research Institute for Sustainable Development; Sarah Alnegheimish, a graduate student in EECS; Sara Pido, a postdoc at MIT; and senior author Kalyan Veeramachaneni, a principal research scientist at the Laboratory for Information and Decision Systems, also contributed to the study. The study will be presented at the IEEE Big Data Conference.
Elucidating Explanations
The researchers focused on a prominent type of machine-learning explanation known as SHAP. In a SHAP explanation, every feature the model uses to make a prediction is assigned a value. For example, if a model predicts house prices, one of its features could be the house’s location. Location would receive a positive or negative value indicating how much that feature shifted the model’s overall prediction.
SHAP explanations are frequently presented as bar plots that show which features matter most or least. However, for a model with more than 100 features, the bar plot quickly becomes unmanageable.
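To make this concrete, here is a minimal sketch of how SHAP values are typically produced and plotted with the open-source `shap` library and scikit-learn. The toy housing features and model are assumptions for illustration only; this is the kind of explanation EXPLINGO starts from, not the researchers’ own code.

```python
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestRegressor

# Toy housing data (hypothetical features, for illustration only).
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "location_score": rng.uniform(0, 10, 500),
    "square_feet": rng.uniform(500, 4000, 500),
    "num_bedrooms": rng.integers(1, 6, 500),
})
y = 50_000 * X["location_score"] + 120 * X["square_feet"] + rng.normal(0, 10_000, 500)

model = RandomForestRegressor(n_estimators=50).fit(X, y)

# SHAP assigns each feature a signed contribution to a single prediction.
explainer = shap.Explainer(model, X)
shap_values = explainer(X.iloc[:1])  # explain the first house

print(dict(zip(X.columns, shap_values.values[0])))

# The usual presentation is a bar plot, which gets unwieldy as the
# number of features grows.
shap.plots.bar(shap_values[0])
```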
As researchers, we have to make a lot of choices about what we are going to present visually. If we choose to show only the top 10, people might wonder what happened to another feature that isn’t in the plot. Using natural language unburdens us from having to make those choices.
Kalyan Veeramachaneni, Study Senior Author and Principal Research Scientist, Laboratory for Information and Decision Systems, Massachusetts Institute of Technology
Rather than using a large language model to construct an explanation in natural language, the researchers utilize the LLM to convert an existing SHAP explanation into a readable narrative.
According to Zytek, having the LLM handle only the natural language portion of the procedure reduces the chance of inaccuracies being introduced into the explanation.
Their system, known as EXPLINGO, consists of two parts that work together.
The first component, NARRATOR, uses an LLM to generate narrative descriptions of SHAP explanations that match user preferences. By first feeding NARRATOR three to five written examples of narrative explanations, the user gives the LLM a style to emulate when generating text.
“Rather than having the user try to define what type of explanation they are looking for, it is easier to just have them write what they want to see,” added Zytek.
This makes NARRATOR easy to adapt to new use cases by providing it with a fresh set of manually written examples.
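A hedged sketch of how a NARRATOR-style few-shot prompt might be assembled is below. The prompt wording, helper names, and example narratives are illustrative assumptions rather than the paper’s actual prompts, and `call_llm` stands in for whatever chat-LLM client is used.

```python
def build_narrator_prompt(shap_contributions: dict[str, float],
                          example_narratives: list[str]) -> str:
    """Combine a few user-written example narratives with a SHAP explanation."""
    examples = "\n\n".join(
        f"Example narrative {i + 1}:\n{text}"
        for i, text in enumerate(example_narratives)
    )
    contributions = "\n".join(
        f"- {feature}: {value:+.2f}"
        for feature, value in shap_contributions.items()
    )
    return (
        "You turn SHAP feature contributions into a short plain-language "
        "explanation, matching the style of the examples below.\n\n"
        f"{examples}\n\n"
        "SHAP contributions for the current prediction:\n"
        f"{contributions}\n\n"
        "Narrative:"
    )

prompt = build_narrator_prompt(
    {"location_score": +41_000.0, "square_feet": +12_500.0, "num_bedrooms": -800.0},
    ["The price is driven mostly by the home's location, with size adding a "
     "smaller boost and bedroom count barely mattering."],
)
narrative = call_llm(prompt)  # hypothetical: any chat-completion client can be plugged in here
```

Because the SHAP values themselves are passed through verbatim and the LLM only rephrases them, the few-shot examples mainly control tone and length rather than content.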
After NARRATOR writes a plain-language explanation, the second component, GRADER, uses an LLM to score the narrative on four criteria: conciseness, accuracy, completeness, and fluency. GRADER automatically prompts the LLM with NARRATOR's text and the SHAP explanation it describes.
“We find that, even when an LLM makes a mistake doing a task, it often won’t make a mistake when checking or validating that task,” Zytek stated.
Users can also personalize GRADER by assigning different weights to each criterion.
“You could imagine, in a high-stakes case, weighting accuracy and completeness much higher than fluency, for example,” she added.
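The sketch below shows how GRADER-style scoring and user-weighted aggregation might look. The four criteria come from the system description; the 0-to-1 score scale, prompt wording, and weighted-average scheme are assumptions for illustration.

```python
CRITERIA = ("conciseness", "accuracy", "completeness", "fluency")

def build_grader_prompt(narrative: str, shap_contributions: dict[str, float]) -> str:
    """Prompt an LLM with the narrative and the SHAP explanation it describes."""
    contributions = "\n".join(f"- {f}: {v:+.2f}" for f, v in shap_contributions.items())
    return (
        "Score the narrative below against the SHAP explanation it describes.\n"
        f"Return a score between 0 and 1 for each of: {', '.join(CRITERIA)}.\n\n"
        f"SHAP contributions:\n{contributions}\n\n"
        f"Narrative:\n{narrative}"
    )

def overall_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average, so a high-stakes user can upweight accuracy and completeness."""
    total = sum(weights[c] for c in CRITERIA)
    return sum(scores[c] * weights[c] for c in CRITERIA) / total

# Example: per-criterion scores parsed from the LLM's response, combined with
# weights that prioritize accuracy and completeness over fluency.
scores = {"conciseness": 0.9, "accuracy": 0.8, "completeness": 0.85, "fluency": 0.95}
weights = {"conciseness": 1.0, "accuracy": 3.0, "completeness": 3.0, "fluency": 0.5}
print(f"overall: {overall_score(scores, weights):.2f}")
```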
Analyzing Narratives
One of the most difficult tasks for Zytek and her team was adjusting the LLM so that it produced narratives that sounded natural. The more guidelines they added to govern style, the more likely the LLM was to introduce errors into the explanation.
“A lot of prompt tuning went into finding and fixing each mistake one at a time,” she added.
To test their approach, the researchers used nine machine-learning datasets with explanations and had different users write narratives for each one. This let them assess NARRATOR’s ability to emulate various styles. They then used GRADER to score every narrative explanation on all four criteria.
Ultimately, the researchers found that their system could produce high-quality narrative explanations while mimicking different writing styles.
According to their findings, including a few manually written example explanations improves the narrative style. However, those examples must be written carefully, because comparative words like “larger” can lead GRADER to classify accurate explanations as erroneous.
Building on these findings, the researchers intend to investigate strategies that could assist their system in better handling comparative words. They also seek to expand EXPLINGO by including rationalization in the explanations.
In the long run, they intend to use their work as a springboard toward an interactive system that lets users ask a model follow-up questions about an explanation.
“That would help with decision-making in a lot of ways. If people disagree with a model’s prediction, we want them to be able to quickly figure out if their intuition is correct, or if the model’s intuition is correct, and where that difference is coming from,” Zytek concluded.