Using images or videos, these AI systems can create simulations that train robots to operate in physical spaces.


Researchers working on large artificial intelligence models like ChatGPT have vast collections of Internet text, images, and videos to train their systems with. But roboticists face hurdles in training physical machines: robot data is expensive, and because the world doesn't have large-scale fleets of robots, there isn't enough data readily available for them to perform well in dynamic environments, such as people's homes.

Some researchers have turned to simulation to train robots. Yet the process, which often involves a graphic designer or engineer, is laborious and expensive.

Two new studies by researchers at the University of Washington have introduced AI systems that use video or images to create simulations that can train robots to perform tasks in real settings. This can significantly reduce the costs of training robots to operate in complex settings.

In the first study, a user quickly scans a space with a smartphone to record its geometry. The system, called RialTo, can then create a “digital twin” simulation of the space, where the user can record how different things work (opening a drawer, for example). A robot can then repeat motions it has practiced in the simulation, with slight variations, to learn to perform them effectively. In the second study, the team built a system called URDFormer, which takes images of real environments from the Internet and quickly creates physically realistic simulation environments in which robots can train.

The teams presented their studies — the first on July 16 and the second on July 19 — at the Robotics: Science and Systems conference in Delft, the Netherlands.

“We're trying to enable systems that go from the real world to simulation cheaply,” said Abhishek Gupta, a UW assistant professor in the Paul G. Allen School of Computer Science and Engineering and co-senior author of both papers. “The systems can then train robots in those simulated scenes, so the robots can operate more effectively in physical spaces. This is useful for safety — you can't have poorly trained robots breaking things and hurting people — and it potentially broadens access. If you can get a robot to work in your home just by scanning it with your phone, that democratizes the technology.”

While many robots are currently well-suited to work in environments such as assembly lines, teaching them to interact with people and work in less structured environments remains a challenge.

“In a factory, for example, there's a ton of repetition,” said Zoe Chen, lead author of the URDFormer study and a UW doctoral student in the Allen School. “The tasks can be difficult, but once you program a robot, it can keep doing them over and over. Homes, by contrast, are unique and constantly changing. There's a diversity of objects, tasks, floor plans and people moving through them. That's where AI becomes really useful for roboticists.”

Both systems approach these challenges in different ways.

RialTo — which Gupta created with a team at the Massachusetts Institute of Technology — requires someone to walk through an environment and capture video of its geometry and moving parts. In a kitchen, for example, they'll open the cabinets, the toaster and the fridge. The system then uses existing AI models — with some quick work from a human through a graphical user interface to show how things move — to create a simulated version of the kitchen shown in the video. A virtual robot then trains itself through trial and error in the simulated environment by repeatedly attempting tasks such as opening that toaster oven — a method called reinforcement learning.
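
To make "trial and error in simulation" concrete, here is a deliberately tiny Python sketch: tabular Q-learning in a toy stand-in for a digital-twin simulator, where a gripper on a discretized track must pull at the right spot to open a toaster oven. The ToyToasterEnv class, positions and rewards are all hypothetical; the sketch illustrates the general reinforcement-learning loop, not RialTo's actual simulator or code.

```python
import random

# Toy stand-in for a "digital twin" simulator: the robot's gripper moves along a
# discretized track and must pull at the latch position to open a toaster oven.
# This is NOT RialTo's code, just a minimal illustration of reinforcement
# learning (trial and error with rewards) inside a simulated environment.
class ToyToasterEnv:
    N_POSITIONS = 10          # discretized gripper positions
    LATCH = 7                 # position where pulling opens the door
    ACTIONS = ["left", "right", "pull"]

    def reset(self):
        self.pos = random.randrange(self.N_POSITIONS)
        return self.pos

    def step(self, action):
        if action == "left":
            self.pos = max(0, self.pos - 1)
        elif action == "right":
            self.pos = min(self.N_POSITIONS - 1, self.pos + 1)
        elif action == "pull" and self.pos == self.LATCH:
            return self.pos, 1.0, True       # success: door opened
        return self.pos, -0.01, False        # small step cost, keep trying

# Tabular Q-learning: repeated trial and error updates an action-value table.
env = ToyToasterEnv()
q = [[0.0] * len(env.ACTIONS) for _ in range(env.N_POSITIONS)]
alpha, gamma, epsilon = 0.5, 0.95, 0.1

for episode in range(2000):
    state, done = env.reset(), False
    for _ in range(50):                      # cap episode length
        if random.random() < epsilon:        # explore
            a = random.randrange(len(env.ACTIONS))
        else:                                # exploit the best known action
            a = max(range(len(env.ACTIONS)), key=lambda i: q[state][i])
        next_state, reward, done = env.step(env.ACTIONS[a])
        target = reward + (0.0 if done else gamma * max(q[next_state]))
        q[state][a] += alpha * (target - q[state][a])
        state = next_state
        if done:
            break

best = max(range(len(env.ACTIONS)), key=lambda i: q[env.LATCH][i])
print("Learned action at latch position:", env.ACTIONS[best])
```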

By going through this process in simulation, the robot improves at the task and can work around disturbances or changes in the environment, such as a mug placed next to the toaster. The robot can then transfer this learning to the physical environment, where it is nearly as accurate as a robot trained in the real kitchen.

The second system, URDFormer, focuses less on high accuracy in a single kitchen. Instead, it quickly and cheaply generates hundreds of generic kitchen simulations. URDFormer scans images from the Internet and pairs them with existing models of how, for example, kitchen drawers and cabinets are likely to move. It then predicts a simulation from an initial real-world image, allowing researchers to quickly and cheaply train robots in a wide range of environments. The trade-off is that these simulations are significantly less accurate than those RialTo produces.
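
URDFormer's name nods to URDF (the Unified Robot Description Format), which many robotics simulators use to describe articulated objects. As a rough, hypothetical illustration of the kind of output such a pipeline targets, the Python sketch below assembles a minimal URDF file from a hand-written parts list; in the actual system, a learned model predicts the parts and their joints from a photograph rather than having them typed in.

```python
# Hypothetical "predictions" standing in for what a learned model might output
# after looking at a photo of a cabinet: a body, a sliding drawer and a hinged door.
predicted_parts = [
    {"name": "cabinet_body", "size": (0.6, 0.6, 0.8), "joint": None},
    {"name": "drawer", "size": (0.55, 0.5, 0.2), "joint": "prismatic",
     "axis": "0 1 0", "limit": (0.0, 0.4)},
    {"name": "door", "size": (0.6, 0.02, 0.6), "joint": "revolute",
     "axis": "0 0 1", "limit": (0.0, 1.57)},
]

def parts_to_urdf(parts):
    """Assemble a minimal URDF description from a list of predicted parts."""
    lines = ['<robot name="predicted_cabinet">']
    for p in parts:
        sx, sy, sz = p["size"]
        lines.append(f'  <link name="{p["name"]}">')
        lines.append(f'    <visual><geometry><box size="{sx} {sy} {sz}"/></geometry></visual>')
        lines.append(f'    <collision><geometry><box size="{sx} {sy} {sz}"/></geometry></collision>')
        lines.append('  </link>')
        if p["joint"]:  # attach movable parts to the cabinet body with an articulated joint
            lo, hi = p["limit"]
            lines.append(f'  <joint name="{p["name"]}_joint" type="{p["joint"]}">')
            lines.append('    <parent link="cabinet_body"/>')
            lines.append(f'    <child link="{p["name"]}"/>')
            lines.append(f'    <axis xyz="{p["axis"]}"/>')
            lines.append(f'    <limit lower="{lo}" upper="{hi}" effort="10" velocity="1"/>')
            lines.append('  </joint>')
    lines.append('</robot>')
    return "\n".join(lines)

print(parts_to_urdf(predicted_parts))
```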

“Both approaches can complement each other,” Gupta said. “URDFormer is really useful for pre-training on hundreds of scenarios. RialTo is particularly useful if you've already pre-trained a robot, and now you want to deploy it in someone's home and have it succeed maybe 95% of the time.”

Moving forward, the RialTo team wants to deploy its system in people's homes (it has largely been tested in the lab), and Gupta said he wants to incorporate small amounts of real-world training data into the systems to improve their success rates.

“Hopefully, just a small amount of real-world data can fix the failures,” Gupta said. “But we still have to figure out how best to combine data collected directly in the real world, which is expensive, with data collected in simulation, which is cheap but slightly wrong.”
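
One simple way to combine the two data sources, sketched below in Python under assumed numbers, is to oversample the small real-world dataset when building training batches so it is not drowned out by the much larger simulated set. The dataset sizes and the 80/20 split are arbitrary illustrations, not figures from either paper.

```python
import random

# Placeholder datasets: simulated data is plentiful and cheap but slightly wrong;
# real-world data is scarce, expensive and accurate. The sizes are made up.
sim_data = [("sim", i) for i in range(100_000)]
real_data = [("real", i) for i in range(500)]

def sample_batch(batch_size=64, real_fraction=0.2):
    """Build a training batch that oversamples the small real-world dataset."""
    n_real = int(batch_size * real_fraction)
    batch = random.choices(real_data, k=n_real)            # oversample real examples
    batch += random.choices(sim_data, k=batch_size - n_real)
    random.shuffle(batch)
    return batch

batch = sample_batch()
n_real = sum(1 for source, _ in batch if source == "real")
print(n_real, "real examples in a batch of", len(batch))
```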

Additional co-authors on the URDFormer paper include UW's Aaron Walsman, Marius Memmel and Alex Fang — doctoral students at the Allen School; Kartikeya Vemori, an undergraduate at the Allen School; Allen Wu, a master's student at the Allen School; and Kaichun Mo, a research scientist at NVIDIA. Allen School Professor Dieter Fox was a co-senior author. Additional co-authors on the RialTo paper include MIT's Marcel Torne, Anthony Simeonov and Tao Chen — all doctoral students; Zichu Li, a research assistant; and April Chan, an undergraduate. Pulkit Agrawal, an assistant professor at MIT, was a co-senior author. The URDFormer research was funded in part by the Amazon Science Hub. The RialTo research was funded in part by a Sony Research Award, the US government and Hyundai Motor Company.
