May 24, 2019
MIT researchers have developed a system that uses only simple maps and visual data to enable autonomous cars to navigate routes in new, complex environments. The aim of the work is to bring more human-like reasoning to driverless vehicles.
With simple tools and observation, human drivers are remarkably good at navigating roads they have never driven before. They simply match what they see around them to what appears on their GPS devices to determine where they are and where they need to go. Autonomous vehicles, however, struggle with this basic reasoning. In each new area, they must first map and analyze all the new roads, which can take considerable time. The systems also depend on complex maps, usually generated by 3D scans, which are computationally demanding to create and process on the fly.
Now, in a paper being presented at the International Conference on Robotics and Automation, the MIT team describes an autonomous control system that "learns" the steering patterns of human drivers as they navigate roads in a small area, using only data from video camera feeds and a simple GPS-like map. The trained system can then control an autonomous vehicle along a planned route in a brand-new area, by imitating the human driver.
Like human drivers, the system can also detect mismatches between its map and the road features it sees. This allows it to determine whether its position, sensors, or mapping are incorrect, so that it can correct the vehicle's course.
To train the system, a human operator first drove an automated Toyota Prius, fitted with several cameras and a basic GPS navigation system, to collect data from local suburban streets containing a variety of road structures and obstacles. When deployed autonomously, the system successfully navigated the vehicle along a preplanned route in a different, forested area designated for autonomous vehicle tests.
With our system, you don’t need to train on every road beforehand. You can download a new map for the car to navigate through roads it has never seen before.
Alexander Amini, Study First Author and Graduate Student, MIT
Our objective is to achieve autonomous navigation that is robust for driving in new environments. For example, if we train an autonomous vehicle to drive in an urban setting such as the streets of Cambridge, the system should also be able to drive smoothly in the woods, even if that is an environment it has never seen before.
Daniela Rus, Study Co-Author and Director, Computer Science and Artificial Intelligence Laboratory (CSAIL), MIT
Rus is also the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science.
Joining Amini and Rus on the paper are Sertac Karaman, an associate professor of aeronautics and astronautics at MIT, and Guy Rosman, a researcher at the Toyota Research Institute.
Point-to-point navigation
Conventional navigation systems process sensor data through multiple modules customized for tasks such as localization, mapping, object detection, motion planning, and steering control. Rus' group has long been developing "end-to-end" navigation systems, which process raw sensory input and directly output steering commands, without the need for specialized modules.
Until now, however, these models were designed strictly to follow the road safely, with no real destination in mind. In the new study, the researchers advanced their end-to-end system to drive from goal to destination in a previously unseen environment. To do so, they trained the system to predict a full probability distribution over all possible steering commands at any given instant while driving.
The system uses a machine learning model called a convolutional neural network (CNN), commonly used for image recognition. During training, the system observes and learns how to steer from a human driver. The CNN correlates steering wheel rotations with the road curvatures it observes through its cameras and an input map. Eventually, it learns the most likely steering command for various driving situations, such as straight roads, four-way or T-shaped intersections, forks, and rotaries.
Initially, at a T-shaped intersection, there are many different directions the car could turn. The model starts by thinking about all those directions, but as it sees more and more data about what people do, it will see that some people turn left and some turn right, but nobody goes straight. Straight ahead is ruled out as a possible direction, and the model learns that, at T-shaped intersections, it can only move left or right.
Daniela Rus, Study Co-Author and Director, Computer Science and Artificial Intelligence Laboratory (CSAIL), MIT
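The paragraphs above describe a CNN that takes camera images and a routed map and outputs a full probability distribution over steering commands. The sketch below illustrates one way such a network could be structured; the layer sizes, the channel stacking, and the choice of a Gaussian-mixture output head are assumptions for illustration, not the authors' exact model.

```python
# Minimal sketch (assumed architecture, not the authors' exact model):
# a CNN that fuses a camera image with a rendered route map and outputs
# the parameters of a Gaussian mixture over steering-wheel angles.
import torch
import torch.nn as nn

class SteeringDistributionNet(nn.Module):
    def __init__(self, num_modes: int = 3):
        super().__init__()
        self.num_modes = num_modes
        # Shared convolutional encoder applied to the stacked
        # camera image (3 channels) and route-map image (1 channel).
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 24, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(24, 36, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(36, 48, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Head predicting mixture weights, means, and log-variances.
        self.head = nn.Linear(48, 3 * num_modes)

    def forward(self, camera, route_map):
        x = torch.cat([camera, route_map], dim=1)    # (B, 4, H, W)
        feats = self.encoder(x)                      # (B, 48)
        params = self.head(feats)                    # (B, 3 * K)
        weights, means, log_vars = params.chunk(3, dim=1)
        return weights.softmax(dim=1), means, log_vars

# Training would minimize the negative log-likelihood of the human
# driver's recorded steering angle under the predicted mixture, so that
# a T-shaped intersection ends up with modes for "left" and "right"
# but negligible probability mass on "straight ahead".
```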
What does the map say?
In testing, the researchers feed the system a map with a randomly chosen route. While driving, the system extracts visual features from the camera, which lets it predict road structures. For example, it can identify a distant stop sign or line breaks on the side of the road as signs of an upcoming intersection. At each instant, it also uses its predicted probability distribution of steering commands to choose the most likely one to follow its route.
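Selecting "the most likely" command from a predicted distribution can be as simple as evaluating the mixture density over a grid of candidate steering angles and taking the peak. The snippet below is an illustrative sketch of that step only; the function name, the angle grid, and the parameter format are assumptions.

```python
# Illustrative sketch (assumed interface): pick the most probable steering
# command by evaluating a predicted Gaussian mixture on a grid of angles.
import numpy as np

def most_likely_steering(weights, means, log_vars, candidates=None):
    """weights, means, log_vars: 1-D arrays of mixture parameters."""
    if candidates is None:
        # Candidate steering-wheel angles in radians.
        candidates = np.linspace(-np.pi / 2, np.pi / 2, 181)
    variances = np.exp(log_vars)
    # Mixture density evaluated at every candidate angle.
    diff = candidates[:, None] - means[None, :]
    density = (weights / np.sqrt(2 * np.pi * variances)
               * np.exp(-0.5 * diff ** 2 / variances)).sum(axis=1)
    return candidates[np.argmax(density)]

# Example: a bimodal prediction at a T-shaped intersection, favoring a
# left or right turn with essentially no mass on going straight.
angle = most_likely_steering(np.array([0.55, 0.45]),
                             np.array([-0.6, 0.7]),
                             np.array([-3.0, -3.0]))
```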
Importantly, the system uses maps that are easy to store and process, the researchers say. Autonomous control systems typically use LIDAR scans to create massive, complex maps that take roughly 4,000 gigabytes (4 terabytes) of data to store just the city of San Francisco. For every new destination, the vehicle must create and process new maps, which amounts to an enormous amount of data processing. The maps used by the new system, by contrast, capture the entire world using just 40 gigabytes of data.
In addition, during autonomous driving, the system continually matches its visual data to the map data and notes any mismatches. This helps the autonomous vehicle better determine its exact position on the road. It also ensures that the vehicle stays on the safest path if it is fed contradictory inputs: if, for example, the vehicle is cruising on a straight road with no turns and the GPS indicates it must turn right, the vehicle will know to keep driving straight or to stop.
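The conflict-handling behavior described above can be pictured as a simple rule: if the map demands a turn that the camera-based road prediction does not support, distrust the map and fall back to a safe action. The sketch below shows that idea under assumed names and a deliberately simplified observation model; it is not the authors' algorithm.

```python
# Rough sketch (assumed logic, not the authors' algorithm): flag a conflict
# when the routed map demands a turn but the camera-based road-structure
# prediction indicates the road only continues straight.
from dataclasses import dataclass

@dataclass
class RoadObservation:
    straight_possible: bool   # camera sees the lane continuing ahead
    turn_possible: bool       # camera sees an intersection or side road

def resolve_command(map_command: str, obs: RoadObservation) -> str:
    """map_command is 'straight', 'left', or 'right'."""
    if map_command in ("left", "right") and not obs.turn_possible:
        # Map and vision disagree: no visible turn, so ignore the map
        # and either keep going straight or stop safely.
        return "straight" if obs.straight_possible else "stop"
    return map_command

# Example: GPS says "turn right" on a road with no visible intersection.
print(resolve_command("right", RoadObservation(True, False)))  # "straight"
```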
In the real world, sensors do fail. We want to make sure that the system is robust to different failures of different sensors by building a system that can accept these noisy inputs and still navigate and localize itself correctly on the road.
Alexander Amini, Study First Author and Graduate Student, MIT
Variational End-to-End Navigation and Localization