Jan 31 2019
A robot is cautiously considering its next move in the basement of MIT’s Building 3. It lightly jabs at a tower of blocks and searches for the best block to pull out without collapsing the tower, in a slow-moving, solitary, but remarkably sprightly game of Jenga.
Designed by MIT engineers, the robot is fitted with a force-sensing wrist cuff, an external camera, and a soft-pronged gripper, all of which it uses to see and feel the tower and its individual blocks.
When the robot cautiously pushes against a block, a computer takes in tactile and visual feedback from the robot’s cuff and camera and compares these measurements to moves the robot has made before. The computer also considers the outcomes of those moves, specifically whether a block, in a particular configuration and pushed with a specific amount of force, was successfully extracted or not. In real time, the robot then “learns” whether to keep pushing the block or move on to a new one, so that the tower does not collapse. The robot is described in the journal Science Robotics.
According to Alberto Rodriguez, the Walter Henry Gale Career Development Assistant Professor in MIT’s Department of Mechanical Engineering, the robot demonstrates something that has been difficult to achieve in earlier systems: the ability to quickly learn the best way to perform a task, not just from visual cues, as is commonly studied today, but also from physical, tactile interactions.
Unlike in more purely cognitive tasks or games such as chess or Go, playing the game of Jenga also requires mastery of physical skills such as probing, pushing, pulling, placing, and aligning pieces. It requires interactive perception and manipulation, where you have to go and touch the tower to learn how and when to move blocks. This is very difficult to simulate, so the robot has to learn in the real world, by interacting with the real Jenga tower. The key challenge is to learn from a relatively small number of experiments by exploiting common sense about objects and physics.
Alberto Rodriguez, the Walter Henry Gale Career Development Assistant Professor, Department of Mechanical Engineering, MIT.
According to Rodriguez, the tactile learning system the researchers developed can be used in applications beyond Jenga, particularly in tasks that require careful physical interaction, such as assembling consumer products or separating recyclable items from landfill trash.
“In a cellphone assembly line, in almost every single step, the feeling of a snap-fit, or a threaded screw, is coming from force and touch rather than vision,” Rodriguez said. “Learning models for those actions is prime real-estate for this kind of technology.”
MIT graduate student Nima Fazeli is the study’s lead author. The research team also includes Miquel Oller, Zheng Wu, Jiajun Wu, and Joshua Tenenbaum, professor of brain and cognitive sciences at MIT.
Push and pull
In Jenga, whose name means “build” in Swahili, 54 rectangular blocks are stacked in 18 layers of three blocks each, with the blocks in each layer oriented perpendicular to those in the layer below. The goal is to carefully extract a block and place it on top of the tower, building a new level, without toppling the whole structure.
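The tower layout described above can be sketched as a small data structure. This is only an illustration: the `Block` class, its `axis` field, and the `build_tower` function are hypothetical names chosen here, not part of the researchers’ software.

```python
from dataclasses import dataclass

LAYERS = 18            # layers in a freshly built tower
BLOCKS_PER_LAYER = 3   # blocks side by side in each layer


@dataclass(frozen=True)
class Block:
    layer: int  # 0 is the bottom layer
    slot: int   # position within the layer: 0, 1, or 2
    axis: str   # "x" or "y": direction of the block's long side


def build_tower(layers=LAYERS, per_layer=BLOCKS_PER_LAYER):
    """Return the blocks of a full tower, bottom layer first."""
    tower = []
    for layer in range(layers):
        # Alternate the orientation so each layer is perpendicular
        # to the layer beneath it.
        axis = "x" if layer % 2 == 0 else "y"
        for slot in range(per_layer):
            tower.append(Block(layer, slot, axis))
    return tower


tower = build_tower()
print(len(tower))  # 54
```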
Programming a robot to play Jenga with conventional machine-learning schemes might require capturing everything that could happen between the robot, a block, and the tower, a computationally expensive task that would need data from a vast number of block-extraction attempts.
Hence, motivated by human cognition and the way people themselves may approach the Jenga game, Rodriguez and his coworkers searched for a more data-efficient method to program a robot to learn to play the game.
The researchers first customized an industry-standard ABB IRB 120 robotic arm and set up a Jenga tower within the robot’s reach. They then began a training period in which the robot first chose a random block and a location on the block against which to push, and then applied a small amount of force in an attempt to push the block out of the tower. For each attempt, a computer recorded the associated visual and force measurements and labeled whether the attempt was a success.
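The training loop described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the researchers’ code: the `Attempt` fields, the `simulated_push` stand-in for the real arm and sensors, and all numeric ranges are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Attempt:
    layer: int           # which layer the chosen block sits in
    slot: int            # which of the three blocks in that layer
    push_offset: float   # hypothetical push location on the block face, -1..1
    force: float         # hypothetical force reading (arbitrary units)
    displacement: float  # hypothetical camera-measured motion (arbitrary units)
    success: bool        # label: did the block come out?


def simulated_push(rng):
    """Stand-in for the real robot: invents a force, a displacement, and a label."""
    stuck = rng.random() < 0.5
    force = rng.uniform(2.0, 3.0) if stuck else rng.uniform(0.2, 0.8)
    displacement = rng.uniform(0.0, 0.1) if stuck else rng.uniform(0.5, 1.0)
    return force, displacement, not stuck


def collect_attempts(n, seed=0):
    """Pick a random block and push site n times, logging each labeled attempt."""
    rng = random.Random(seed)
    log = []
    for _ in range(n):
        layer = rng.randrange(18)
        slot = rng.randrange(3)
        push_offset = rng.uniform(-1.0, 1.0)
        force, displacement, success = simulated_push(rng)
        log.append(Attempt(layer, slot, push_offset, force, displacement, success))
    return log


log = collect_attempts(300)
print(len(log), "attempts,", sum(a.success for a in log), "labeled successful")
```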
Rather than carry out a vast number of such attempts, which would mean rebuilding the tower almost as many times, the robot trained on only about 300. Attempts with similar measurements and outcomes were grouped into clusters representing particular block behaviors. For example, one cluster might represent attempts on a block that was hard to move, another attempts on a block that moved easily, and another attempts that toppled the tower. For each cluster, the robot developed a simple model to predict a block’s behavior given its current visual and tactile measurements.
According to Fazeli, this clustering method considerably boosts the efficiency through which the robot can learn to play the Jenga game, and is motivated by the natural way in which similar behavior is clustered by humans: “The robot builds clusters and then learns models for each of these clusters, instead of learning a model that captures absolutely everything that could happen.”
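The cluster-then-model idea can be illustrated with a minimal sketch. It uses plain k-means and a per-cluster success rate as stand-ins; the article does not say which clustering method or per-cluster model the researchers used, so every function name and number below is an assumption made for illustration.

```python
import math


def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic start (a stand-in clustering method)."""
    ordered = sorted(points)
    # Spread the initial centers across the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))


def fit_cluster_models(attempts, k=2):
    """Cluster (force, displacement) pairs, then fit one simple model per cluster.

    The per-cluster "model" here is just the observed success rate.
    """
    centers = kmeans([(f, d) for f, d, _ in attempts], k)
    counts = [[0, 0] for _ in range(k)]  # [successes, total] per cluster
    for f, d, ok in attempts:
        c = nearest((f, d), centers)
        counts[c][0] += int(ok)
        counts[c][1] += 1
    rates = [s / t if t else 0.0 for s, t in counts]
    return centers, rates


def predict_success(force, displacement, centers, rates):
    """Predict extraction success from the cluster the new measurement falls in."""
    return rates[nearest((force, displacement), centers)]


# Made-up (force, displacement, success) attempts: three easy blocks, three stuck ones.
attempts = [
    (0.3, 0.9, True), (0.4, 0.8, True), (0.5, 0.7, True),
    (2.5, 0.05, False), (2.8, 0.02, False), (2.2, 0.10, False),
]
centers, rates = fit_cluster_models(attempts)
print(predict_success(0.35, 0.85, centers, rates))  # 1.0
print(predict_success(2.6, 0.04, centers, rates))   # 0.0
```

A new push that produces low force and large motion lands in the "easy" cluster and is predicted to succeed, while a high-force, low-motion push lands in the "stuck" cluster, which would suggest moving on to another block.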
Stacking up
The method was tested against other state-of-the-art machine-learning algorithms in a computer simulation of the game built with the simulator MuJoCo. The lessons learned in the simulator informed the researchers of how the robot would learn in the real world.
“We provide to these algorithms the same information our system gets, to see how they learn to play Jenga at a similar level,” Oller stated. “Compared with our approach, these algorithms need to explore orders of magnitude more towers to learn the game.”
The researchers were curious to know how their machine-learning method stacks up against real human players, and eventually performed some informal trials with a number of volunteers.
“We saw how many blocks a human was able to extract before the tower fell, and the difference was not that much,” Oller stated.
However, if the team wants to competitively pit its robot against a human player, there is still a long way to go. Apart from physical interactions, Jenga needs a strategy, like extracting only the right block that will make it hard for an opponent to extract the next block without collapsing the tower.
At present, the researchers are not too keen on designing a robotic Jenga champion, and instead, are focusing more on applying the new skills of the robot to other application domains.
There are many tasks that we do with our hands where the feeling of doing it ‘the right way’ comes in the language of forces and tactile cues. For tasks like these, a similar approach to ours could figure it out.
Alberto Rodriguez, the Walter Henry Gale Career Development Assistant Professor, Department of Mechanical Engineering, MIT.
The study was partly supported by the National Science Foundation via the National Robotics Initiative.
Video: MIT Robot Learns How to Play Jenga (Credit: MIT)