Jan 25 2019
Researchers at MIT and Microsoft have developed a model that identifies instances in which autonomous systems have “learned” from training examples that do not match what actually happens in the real world.
This model could be used by engineers to enhance the safety of artificial intelligence systems like autonomous robots and driverless vehicles.
For instance, the AI systems that power driverless cars are trained extensively in virtual simulations to prepare the vehicle for nearly any event on the road. But sometimes the car makes an unexpected error in the real world because an event occurs that should, but does not, alter the car’s behavior.
Consider a driverless car that wasn’t trained, and more importantly doesn’t have the sensors necessary, to differentiate between clearly distinct scenarios, such as large, white cars and ambulances with red, flashing lights. If the car is cruising down the highway and an ambulance flips on its sirens, the car may not know to slow down and pull over, because it does not perceive the ambulance as different from a big white car.
In a couple of papers—presented at the Autonomous Agents and Multiagent Systems conference last year and to be presented at the upcoming Association for the Advancement of Artificial Intelligence conference—the scientists report a model that uncovers these training “blind spots” using human input.
As in conventional approaches, the researchers put an AI system through simulation training. But here, a human closely monitors the system’s actions as it acts in the real world, providing feedback whenever the system makes, or is about to make, a mistake. The researchers then combine the training data with the human feedback data and use machine-learning techniques to produce a model that pinpoints the situations in which the system is most likely to need more information about how to act correctly.
The researchers validated their technique using video games, with a simulated human correcting the learned path of an on-screen character. The next step is to incorporate the model into conventional training and testing approaches for autonomous cars and robots with human feedback.
The model helps autonomous systems better know what they don’t know. Many times, when these systems are deployed, their trained simulations don’t match the real-world setting [and] they could make mistakes, such as getting into accidents. The idea is to use humans to bridge that gap between simulation and the real world, in a safe way, so we can reduce some of those errors.
Ramya Ramakrishnan, Graduate Student, Computer Science and Artificial Intelligence Laboratory, MIT.
Ramakrishnan is the first author of the study.
Julie Shah, an associate professor in the Department of Aeronautics and Astronautics and head of CSAIL’s Interactive Robotics Group, and Ece Kamar, Debadeepta Dey, and Eric Horvitz, all of Microsoft Research, are co-authors on both papers. Besmira Nushi is an additional co-author on the upcoming paper.
Taking Feedback
Some conventional training approaches do provide human feedback during real-world test runs, but only to update the system’s actions. They do not identify blind spots, which would be useful for safer real-world execution.
The researchers’ approach first puts an AI system through simulation training, where it produces a “policy” that maps each situation to the best action it can take in the simulations. The system is then deployed in the real world, where humans provide error signals in the regions where its actions are unacceptable.
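For illustration, a simulation-trained policy over a discrete state space can be thought of as a simple state-to-action lookup. The sketch below is hypothetical (the state names, Q-table, and greedy selection are assumptions, not the authors’ implementation):

```python
# Hypothetical sketch: a tabular policy learned in simulation, assuming a
# discrete state space and a Q-table (state -> {action: value}) produced by
# any standard reinforcement-learning method.
def greedy_policy(q_table):
    """Map each simulated state to its highest-valued action."""
    return {state: max(values, key=values.get) for state, values in q_table.items()}

# Because the simulator never distinguished ambulances from large white cars,
# both collapse into one state and therefore one learned action.
q_table = {"large_white_vehicle_ahead": {"keep_lane": 0.9, "pull_over": 0.2}}
policy = greedy_policy(q_table)  # {"large_white_vehicle_ahead": "keep_lane"}
```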
Humans can provide data in several ways, for example, through “demonstrations” and “corrections.” In demonstrations, the system observes how the human acts in the real world and compares those actions to what it would have done in the same situation. In the case of driverless cars, for example, a human would manually control the car while the system generates a signal whenever its planned behavior deviates from the human’s behavior. Matches and mismatches with the human’s actions provide noisy indications of where the system is likely to be acting acceptably or unacceptably.
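As an illustrative sketch (not the paper’s code), demonstration feedback can be reduced to noisy per-state labels by comparing the policy’s planned action with the human’s action in each state:

```python
# Illustrative sketch: derive noisy "acceptable"/"unacceptable" labels from a
# human demonstration by comparing planned and demonstrated actions per state.
def labels_from_demonstration(policy, demonstration):
    """demonstration: list of (state, human_action) pairs observed in the real world."""
    labels = []
    for state, human_action in demonstration:
        planned = policy.get(state)
        # A match is weak evidence the policy is acceptable here; a mismatch is
        # weak evidence it is not. Both kinds of signal are noisy.
        labels.append((state, "acceptable" if planned == human_action else "unacceptable"))
    return labels
```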
Alternatively, the human can provide corrections, monitoring the system’s actions as it operates in the real world. A human might sit in the driver’s seat while the autonomous car drives itself along its planned route. If the car’s actions are correct, the human does nothing. If its actions are incorrect, the human takes the wheel, sending a signal that the system was acting unacceptably in that specific situation.
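Correction feedback can be encoded the same way. In this hypothetical sketch, a state is labeled unacceptable whenever the safety driver takes over:

```python
# Illustrative sketch: derive labels from corrections. Each log entry records
# the state the car was in and whether the human took control there.
def labels_from_corrections(drive_log):
    """drive_log: list of (state, human_took_over) pairs from a monitored drive."""
    return [(state, "unacceptable" if took_over else "acceptable")
            for state, took_over in drive_log]
```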
Once the human feedback data is compiled, the system essentially has a list of situations and, for each situation, multiple labels indicating whether its actions were acceptable or unacceptable. A single situation can receive many different signals, because the system perceives many distinct situations as identical. For example, an autonomous car may have cruised alongside a large car many times without slowing down and pulling over. But in just one instance an ambulance, which appears exactly the same to the system, cruises by. The car doesn’t pull over and receives a feedback signal that it acted unacceptably.
At that point, the system has been given multiple contradictory signals from a human: some with a large car beside it, and it was doing fine, and one where there was an ambulance in the same exact location, but that wasn’t fine. The system makes a little note that it did something wrong, but it doesn’t know why. Because the agent is getting all these contradictory signals, the next step is compiling the information to ask, ‘How likely am I to make a mistake in this situation where I received these mixed signals?’
Ramya Ramakrishnan, Graduate Student, Computer Science and Artificial Intelligence Laboratory, MIT.
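Because the ambulance and the large white car collapse into the same internal state, the compiled feedback for that state contains conflicting labels, roughly like this hypothetical example:

```python
# Hypothetical compiled feedback: one perceptual state, many noisy labels.
# Nine encounters with large white cars looked fine; the single ambulance did not.
feedback = {
    "large_white_vehicle_ahead": ["acceptable"] * 9 + ["unacceptable"],
}
```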
Intelligent Aggregation
The end goal is to have these ambiguous situations labeled as blind spots. But doing so requires more than simply tallying the acceptable and unacceptable actions for each situation. If the system performed correct actions nine times out of ten in the ambulance situation, for example, a simple majority vote would label that situation safe.
“But because unacceptable actions are far rarer than acceptable actions, the system will eventually learn to predict all situations as safe, which can be extremely dangerous,” stated Ramakrishnan.
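A naive majority vote over such labels illustrates the danger: the rare “unacceptable” signal is simply outvoted (hypothetical sketch):

```python
from collections import Counter

# Hypothetical labels for the ambiguous state: nine safe encounters, one takeover.
labels = ["acceptable"] * 9 + ["unacceptable"]
majority = Counter(labels).most_common(1)[0][0]
print(majority)  # "acceptable" -- the single ambulance signal is outvoted
```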
For this reason, the researchers used the Dawid-Skene algorithm, a machine-learning method commonly used in crowdsourcing to handle label noise. The algorithm takes as input a list of situations, each with a set of noisy “acceptable” and “unacceptable” labels. It then aggregates all of this data and uses probability calculations to identify patterns in the labels of predicted safe situations and patterns of predicted blind spots. Using that information, it outputs a single aggregated “safe” or “blind spot” label for each situation, along with its confidence in that label. Notably, the algorithm can learn that even in a situation where the system acted acceptably, say, 90% of the time, the situation is still ambiguous enough to merit a “blind spot.”
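The aggregation step might look like the following minimal Dawid-Skene-style sketch, written as expectation-maximization over noisy labels. It is an illustration under simplifying assumptions (binary latent classes, binary labels, and each feedback episode treated as a separate noisy annotator), not the authors’ implementation:

```python
import numpy as np

def dawid_skene(labels, n_items, n_annotators, n_classes=2, n_iter=50):
    """labels: list of (item, annotator, observed_label) with observed_label in {0, 1}
    (0 = acceptable, 1 = unacceptable). Returns q[i, c], the posterior probability
    that situation i belongs to latent class c (c = 0 ~ "safe", c = 1 ~ "blind spot")."""
    # Initialize posteriors from per-item label frequencies (a soft majority vote).
    q = np.full((n_items, n_classes), 1.0 / n_classes)
    counts = np.zeros((n_items, n_classes))
    for i, _, l in labels:
        counts[i, l] += 1
    observed = counts.sum(axis=1) > 0
    q[observed] = counts[observed] / counts[observed].sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: class prior and per-annotator confusion matrices.
        prior = q.mean(axis=0) + 1e-9
        conf = np.full((n_annotators, n_classes, n_classes), 1e-6)
        for i, a, l in labels:
            conf[a, :, l] += q[i]
        conf /= conf.sum(axis=2, keepdims=True)

        # E-step: posterior over each item's latent class given all of its labels.
        log_q = np.tile(np.log(prior), (n_items, 1))
        for i, a, l in labels:
            log_q[i] += np.log(conf[a, :, l])
        log_q -= log_q.max(axis=1, keepdims=True)
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)
    return q
```

In this sketch, q[i, 1] plays the role of the per-situation blind-spot probability, and the learned confusion matrices are what allow a mostly-acceptable but ambiguous situation to retain a high blind-spot probability rather than being dismissed by a strict majority.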
In the end, the algorithm produces a kind of “heat map” in which each situation from the system’s original training is assigned a low-to-high probability of being a blind spot for the system.
When the system is deployed into the real world, it can use this learned model to act more cautiously and intelligently. If the learned model predicts a state to be a blind spot with high probability, the system can query a human for the acceptable action, allowing for safer execution.
Ramya Ramakrishnan, Graduate Student, Computer Science and Artificial Intelligence Laboratory, MIT.
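At deployment time, that learned model could gate the trained policy. The following is a hypothetical sketch; the threshold and the ask_human hook are assumptions for illustration:

```python
# Hypothetical deployment-time gate: act autonomously in predicted-safe states
# and defer to a human in predicted blind spots. `blind_spot_prob` maps each
# state to the learned probability of it being a blind spot.
def act(state, policy, blind_spot_prob, ask_human, threshold=0.5):
    if blind_spot_prob.get(state, 1.0) >= threshold:  # treat unknown states cautiously
        return ask_human(state)
    return policy[state]
```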