New AI Algorithm Enables Faster, More Reliable Learning

Recent work by Northwestern Engineering researchers unveils a novel artificial intelligence (AI) algorithm tailored for smart robotics. These findings were published in Nature Machine Intelligence. This new method aims to enhance the practicality and safety of robots across various applications, such as self-driving cars, delivery drones, household assistants, and automation, by facilitating rapid and dependable learning of complex skills.


The algorithm known as MaxDiff RL is successful because it motivates robots to investigate their surroundings as haphazardly as possible to acquire a variety of experiences. The quality of the information that robots gather about their immediate environment is enhanced by this “designed randomness.”

Simulated robots showed faster and more effective learning using higher-quality data, enhancing their overall performance and reliability.

When tested against other AI platforms, simulated robots using Northwestern’s new algorithm consistently outperformed those using state-of-the-art models.

The efficacy of the new algorithm is notable, as robots swiftly acquire new tasks and execute them successfully on the first attempt, demonstrating a stark departure from current AI models that rely on slower trial-and-error learning methods.

Other AI frameworks can be somewhat unreliable. Sometimes, they will totally nail a task, but other times, they will fail completely. With our framework, as long as the robot is capable of solving the task at all, every time you turn on your robot, you can expect it to do exactly what it’s been asked to do. This makes it easier to interpret robot successes and failures, which is crucial in a world increasingly dependent on AI.

Thomas Berrueta, Study Lead, Northwestern University

Berrueta is a Mechanical Engineering Ph.D. candidate at the McCormick School of Engineering and a Presidential Fellow at Northwestern. The paper’s senior author is robotics specialist Todd Murphey, a Professor of Mechanical Engineering at McCormick and Berrueta's advisor. Allison Pinosky, a Ph.D. candidate in Murphey's lab, co-authored the paper with Berrueta and Murphey.

The Disembodied Disconnect

Scientists, developers, and researchers use vast amounts of human-curated and filtered big data to train machine-learning algorithms. Through trial and error, AI gains knowledge from this training set and eventually achieves optimal outcomes. This method works well for disembodied systems like ChatGPT and Google Gemini (formerly Bard), but it does not work for embodied AI systems such as robots. Robots instead gather data on their own, without the assistance of human curators.

Traditional algorithms are not compatible with robotics in two distinct ways. First, disembodied systems can take advantage of a world where physical laws do not apply. Second, individual failures have no consequences. For computer science applications, the only thing that matters is that it succeeds most of the time. In robotics, one failure could be catastrophic.

Todd Murphey, Professor, Robotics Specialist and Study Senior Author, Department of Mechanical Engineering, McCormick School of Engineering

Murphey is Berrueta's advisor.
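The point about single failures can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes each run succeeds independently with the same probability, and the 99% success rate is a hypothetical figure, not a result from the study. The function name `prob_any_failure` is also just a label chosen here.

```python
def prob_any_failure(success_rate: float, n_runs: int) -> float:
    """Chance of at least one failure across n_runs independent attempts."""
    # All runs succeed with probability success_rate ** n_runs,
    # so at least one fails with the complement of that.
    return 1.0 - success_rate ** n_runs


# Hypothetical per-run success rate of 99%, evaluated over more and more runs.
for n in (1, 10, 100):
    print(n, round(prob_any_failure(0.99, n), 3))
```

Under these assumptions, a system that succeeds 99% of the time still has roughly a 63% chance of failing at least once over 100 runs, which is why "most of the time" is a weak standard for a physical robot.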

To bridge this gap, Berrueta, Murphey, and Pinosky set out to create an algorithm that ensures robots gather high-quality data while they are in motion. Fundamentally, MaxDiff RL instructs robots to move more randomly to gather comprehensive, varied data about their surroundings. Robots learn through self-selected random experiences, gaining the abilities they need to perform practical tasks.
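One rough way to picture "varied data" is to measure how evenly an agent's visits are spread over the places it could be. The toy below is not MaxDiff RL and is not taken from the paper. It is a minimal sketch, with a made-up one-dimensional world and hypothetical helper names (`visitation_entropy`, `rollout`), comparing a repetitive policy with a randomized one using the Shannon entropy of the visited positions.

```python
import math
import random
from collections import Counter


def visitation_entropy(states):
    """Shannon entropy (in bits) of how often each state was visited."""
    counts = Counter(states)
    total = len(states)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def rollout(policy, n_cells=10, n_steps=200, seed=0):
    """Walk along a line of n_cells cells; walls clamp the position."""
    rng = random.Random(seed)
    pos = 0
    visited = [pos]
    for _ in range(n_steps):
        pos = min(n_cells - 1, max(0, pos + policy(rng)))
        visited.append(pos)
    return visited


def always_right(rng):
    """Repetitive behaviour: the agent ends up stuck at the far wall."""
    return 1


def random_step(rng):
    """Randomized behaviour: step left or right with equal probability."""
    return rng.choice((-1, 1))


h_fixed = visitation_entropy(rollout(always_right))
h_random = visitation_entropy(rollout(random_step))
print(f"repetitive policy: {h_fixed:.2f} bits")
print(f"randomized policy: {h_random:.2f} bits")
```

In this toy world the randomized walker spreads its visits over far more of the line, so its visitation entropy is higher. The actual algorithm works on much richer robot dynamics, and this sketch only conveys the intuition that randomness can broaden what an agent experiences.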

Getting it Right the First Time

The researchers tested the new algorithm against the most advanced models available at the time, training simulated robots to carry out several routine tasks. Robots using MaxDiff RL generally picked up new skills more quickly than those using other models. They also completed tasks far more consistently and dependably.

Even more remarkably, robots utilizing the MaxDiff RL method frequently achieved accurate task execution in just one attempt, even when commencing with no prior knowledge.

Our robots were faster and more agile—capable of effectively generalizing what they learned and applying it to new situations. For real-world applications where robots can’t afford endless time for trial and error, this is a huge benefit.

Berrueta, Ph.D. Candidate and Presidential Fellow, Department of Mechanical Engineering, McCormick School of Engineering

MaxDiff RL is a general algorithm, suitable for many different applications. The researchers hope it will allow trustworthy decision-making in smart robotics by addressing fundamental problems impeding the field.

Allison Pinosky adds, “This doesn’t have to be used only for robotic vehicles that move around. It also could be used for stationary robots—such as a robotic arm in a kitchen that learns how to load the dishwasher. As tasks and physical environments become more complicated, the role of embodiment becomes even more crucial to consider during the learning process. This is an important step toward real systems that do more complicated, more interesting tasks.”

Video (Single Swimmer Fixed): A simulation of a single robot testing the new AI algorithm. Video Credit: Northwestern University.

Journal Reference:

Berrueta, T. A., et al. (2024) Maximum diffusion reinforcement learning. Nature Machine Intelligence. doi.org/10.1038/s42256-024-00829-3.
