AI is great at predicting text. Can it guide robots?

13 min

Feb 11, 2026

Summary

The episode explores how AI is moving from the digital realm into physical robotics, examining both the promising developments and significant challenges. While AI-powered robots can learn tasks through demonstration, they currently struggle with consistency, require massive amounts of training data, and face fundamental limitations in generalizing to real-world scenarios.

Insights

AI chatbots succeed because they train on internet-scale data, but robotics lacks comparable training datasets, creating a fundamental bottleneck for AI robot development
Current AI robots achieve 75-90% task success rates but fail unpredictably, requiring human intervention to fix mistakes, making them impractical for widespread deployment
Simulation-based training shows promise for locomotion but struggles with complex manipulation tasks involving forces and object interactions that don't transfer reliably to physical reality
Practical near-term applications focus on AI enhancing specific robotic functions (vision, object recognition) rather than end-to-end autonomous task completion
The core challenge isn't data quantity alone but finding the right problem framing and learning approaches to enable robots to generalize beyond their training scenarios

Trends

AI-powered robotics moving from research labs into commercial applications (package sorting, laundry folding)
Hybrid approach gaining traction: using AI for specific robotic subtasks rather than full autonomous control
Simulation-based training becoming critical for scaling robot learning without 100,000+ years of real-world data collection
Major tech companies (Tesla, Google) investing heavily in humanoid robots despite unresolved fundamental challenges
Growing recognition that the robotics AI problem is fundamentally different from, and harder than, language model prediction
Self-teaching robots and robot-to-robot learning emerging as potential solutions to the data scarcity problem
Commercial robotics focusing on high-value, repetitive tasks (package sorting, laundry) rather than general-purpose automation
Gap widening between public AI hype and actual robotic capabilities in real-world deployment

Topics

AI neural networks for robotics
Training data requirements for AI robots
Simulation-based robot learning
Humanoid robot development
Robot vision and object recognition
Task generalization in robotics
Real-world robot deployment challenges
AI chatbots vs. robotics AI differences
Robotic manipulation and grasping
Package sorting automation
Laundry folding robots
Robot learning from demonstration
Physics simulation for robotics
Commercial robotics applications
AI safety in physical systems

Companies

Tesla

Featured Optimus, its AI-powered humanoid robot, at a 2024 marketing event

Google

Unveiled a humanoid robot that operates using AI, bringing Gemini 2.0's intelligence to general-purpose robotic agents

OpenVLA

Open-source AI model (a research project rather than a company) that powers the teachable neural network in Stanford's robotic arm demonstrations

Physical Intelligence

Startup co-founded by Chelsea Finn demonstrating mobile robots that fold laundry using AI-trained systems

People

Chelsea Finn

Director of Stanford's IRIS Laboratory researching AI-powered robotics; co-founder of Physical Intelligence startup

Moo Jin Kim

Graduate student at Stanford IRIS Lab working on OpenVLA-powered robotic arms for task learning

Ken Goldberg

UC Berkeley professor; co-founder of AI-powered package sorting company using image recognition for robotics

Pulkit Agrawal

MIT researcher exploring simulation-based training to generate large-scale robot learning data efficiently

Matthew Johnson-Roberson

Carnegie Mellon University researcher questioning fundamental problem framing in AI robotics development

Elon Musk

Tesla CEO who prominently featured the humanoid robot Optimus at a 2024 marketing event

Quotes

"In the long term, we want to develop software that would allow the robots to operate intelligently in any situation."

Chelsea Finn

"Robots are not going to suddenly become the science fiction dream overnight."

Ken Goldberg

"At this current rate, we're going to take 100,000 years to get that much data."

Ken Goldberg

"The question is not, do we have enough data? It is more, what is the framing of the problem?"

Matthew Johnson-Roberson

"AI being used for parts of the robotic problem, you know, walking or vision or whatever. It's going to make big progress. It just may not arrive everywhere all at once."

Geoff Brumfiel

Full Transcript

You're listening to Short Wave from NPR.

Hey Short Wavers, Regina Barber here. I don't know about you, but for me, recently, it seems like artificial intelligence is everywhere. It's in my search results, on everyone's phones, being pitched in my email, trying to read my emails. So I thought it would be a perfect time to revisit an episode we did last year with NPR science correspondent Geoff Brumfiel. He noticed that AI isn't just showing up online anymore, it's creeping into reality. Like at Tesla's big marketing event in 2024.

Yep, AI was there. Speaking of robots. Tesla is obviously a car company, but Elon Musk, Tesla's CEO, made a big part of the event about a humanoid robot powered by AI and called Optimus.

The software, the AI inference computer, it all actually applies to a humanoid robot.

And Google unveiled another humanoid robot that operates using AI.

We're bringing Gemini 2.0's intelligence to general purpose robotic agents in the physical world.

Okay, Geoff, but even before AI came along, people and companies have been making, like, big claims about robots.

They have, they have. And the robots, as I'm sure you know, Gina, have always disappointed compared to the vision.

Yeah, that's true.

And that's why I set out to understand the truth about AI and robotics.

The truth.

And I think I kind of found it in a bowl of trail mix.

Today on the show, what happens when artificial intelligence moves out of the chat and into the real world? We're looking at how AI could maybe revolutionize robotics. You're listening to Short Wave, the science podcast from NPR.

Okay, so Geoff, you were interested in finding out more about how AI works in robots. Where did you start?

Well, I didn't go to Tesla or Google, but I did drive right by them on my way to Stanford University.

Okay.

And specifically the IRIS Laboratory, which stands for Intelligence Through Robotic Interaction at Scale. I got a tour from a graduate student named Moo Jin Kim. Moo Jin works on a new kind of robot powered by AI similar to the AI used in chatbots.

It's one step in the direction of ChatGPT for robotics, but still a lot of work to do.

Okay, all right. Well, you want to show me what it can do?

For sure.

So Geoff, what did the robot look like?

Well, this wasn't some humanoid robot that the big tech companies are rolling out. It's just a pair of mechanical arms with pinchers.

Okay.

But what made it interesting was that it's powered by an AI model called OpenVLA. So first, we should probably just say quickly, you know, a regular robot must be very, very carefully programmed. An engineer has to write it detailed instructions for every task you want it to perform.

Yeah, and AI is supposed to change that.

Exactly. And that's what's going on here. This robot is powered by a teachable AI neural network. The neural network operates kind of how scientists think the human brain might work.
Basically, there are these mathematical nodes in the network that have billions of connections to each other in a way similar to how neurons in the brain are connected together. And so when you go to program this sort of thing, it's simply about reinforcing the connections that matter between the nodes and weakening the other ones that don't. So in practice, this means Moo Jin can just teach OpenVLA a task by showing it.

So basically, whatever task you want to do, you just keep doing it over and over, maybe like 50 times or 100 times.

The robot's AI neural network becomes tuned to that task, and then it can do it by itself.

Yeah, it makes me think of this, like, smiling robot story we did, and that robot just watched, like, a lot of videos of people smiling, and then it learned how to do it.

Yeah, it's exactly the same thing, except instead of just smiling, this robot's actually doing stuff.

Right.

So to show me, Moo Jin brought out a tray of different kinds of trail mix, and I typed in what I wanted it to do.

Okay, so scoop some green ones with the nuts into the bowl. Oh, my gosh. See what happens. Oh, my gosh.

Okay, so Geoff, personally, I've been waiting for something like AI and robotics because you can teach it to do something, you can ask it to do something to, like, make me an ice cream sundae or something without, like, any fancy programming or special knowledge.

That's exactly it, you know? And this really is the dream of the researcher who runs this laboratory. Her name is Chelsea Finn.

So in the long term, we want to develop software that would allow the robots to operate intelligently in any situation.

And by intelligently, she means the robot could understand a simple command like scoop some green ones into a bowl or make me a sundae and then execute in the real world.

Even just to do very basic things like being able to make a sandwich or being able to clean a kitchen or being able to restock grocery store shelves.

These are simple tasks that could help humans do their jobs or do tasks at home. Now, Chelsea also has co-founded a startup called Physical Intelligence. It recently demonstrated a mobile robot that could take laundry out of a dryer and fold it. Again, this robot was taught by humans training its powerful AI program.

Okay, so ice cream sundaes, is that too advanced? Is folding an easier start?

I mean, I'd actually argue, Gina, that folding is harder.

Okay.

Let me show you a video.

Okay, it's going to the dryer. It's pulling stuff out, putting it in a basket. It has the concentration I have when I'm going to do laundry. It almost looks, like, annoyed with folding like I do. Oh my God, it's doing really well, actually.

Yes, it is, right? And this is a complicated task. It's got to pull these clothes out. It's got to figure out what they are.

It doesn't even have a head, but I'm, like, giving it personality. It looks like it's like, oh, I just got to fold another one. Okay. So is it really as simple as, like, just teaching a robot, like, what to do? Because if it was, wouldn't these robots be everywhere?

Yeah, I mean, right. It looks cool on the video. The truth is that, you know, when you get out and these robots are trying to do these tasks over and over again, they get confused, they misunderstand, they make mistakes and they just get stuck. So, you know, it might be able to fold laundry 90% of the time or 75% of the time, but the rest of the time it's going to make a big mess that then a human has to get in there and clean up.

Got it. OK.

I spoke to Ken Goldberg, a professor at the University of California at Berkeley, and he's pretty emphatic that AI-powered robots aren't here yet.

Robots are not going to suddenly become the science fiction dream overnight.

OK, so, like, tell me why, because, like, AI chatbots have gotten, like, way better, super fast. So why are these robots getting stuck?

OK, so it's true that AI has improved massively over the past couple years.
But that's because chatbots have a huge amount of data to learn from. They've taken basically the entire internet to train themselves how to write sentences and draw pictures. But Ken says, for robotics, there's nothing.

We don't have anything to start with, right? There's no examples online of robot commands being generated in response to robot inputs.

And if robots really need as much training data as their virtual chatbot friends, then having humans teach them one task at a time is going to take a really long time.

You know, at this current rate, we're going to take 100,000 years to get that much data.

OK, that's so long. Like, are there any alternatives? There must be.

Yeah. Well, scientists are exploring them right now. And one might be to let the AI brain of the robot learn in a simulation. A researcher who's trying this is a guy named Pulkit Agrawal. He's at the Massachusetts Institute of Technology.

The power of simulation is that we can collect, you know, very large amounts of data. For example, in three hours, you know, worth of simulation, we can collect 100 days worth of data.

So this is a really promising approach for some things, but it's much more of a challenge for others. So, for example, let's talk about walking. When you're just dealing with the Earth and your body, the physics of walking around, it's actually kind of simple.

When you're doing locomotion, you know, you're mostly on Earth. You know, there's no amount of force you can apply which will make the Earth move.

And so the simulation can do that reasonably well. But if you want your robot to, say, try and pick up a mug off a desk or something, that's a lot more complicated.

Or forces. You know, if you apply the wrong forces, these objects can fly away very quickly.

Basically, your robot will fling things across the room if it doesn't understand the weight and the size of what it's carrying. And there's more.
You know, if your robot encounters anything that you haven't simulated 100% perfectly, then it won't know what to do. It'll just break.

Okay, so it sounds like these simulations have limits and real-world training is going to take a while. I can begin to see why AI robots aren't going to be here tomorrow.

Exactly. And some researchers think there are even deeper problems, actually, with trying to put AI into robotics. One of them is Matthew Johnson-Roberson at Carnegie Mellon University in Pittsburgh.

In my mind, the question is not, do we have enough data? It is more, what is the framing of the problem?

So getting back to AI chatbots for a minute, Matt says for all their incredible skills, the task we're asking them to do is actually relatively simple. You know, you look at what a human user types and then try to predict the next words that user wants to see. Robots have so much more that they're going to have to do than just compose a sentence.

Right. Next best word prediction works really well. And it's a very simple problem because you're just predicting the next word. And it is not clear right now I can take 20 hours of GoPro footage and then produce anything sensible with respect to how a robot moves around in the world.

So in other words, the sci-fi tasks that we want our robots to do are so complicated compared to sentence writing. No amount of data may be enough unless researchers can find the right way to teach the robots.

Or have the robots teach the robots.

Yes. That's also an option. They can teach themselves.

Okay. So, Geoff, you've taken me from, like, optimist to pessimist.

It's the, you know, the road I take every day.

I'm starting to think that AI is, like, never going to work that well in robots or, like, it's going to be a really long time.

You know, I'm sorry if I've, like, turned you into a pessimist here, Gina. It happens.
And I'm going to have to sort of whipsaw you back because AI is already finding its way into robotics in ways that are really interesting. So, for example, Ken Goldberg has co-founded a package sorting company. And just this year, they started using AI image recognition to pick the best points for their robots to grab the packages.

Ooh, okay.

Yeah. And it's working really well, he told me. And I think we're going to see a lot of that. AI being used for parts of the robotic problem, you know, walking or vision or whatever. It's going to make big progress. It just may not arrive everywhere all at once. And to really end on a high note here, let's get back to that Stanford lab. Remember, I asked it to grab some trail mix, right? So the robot correctly identified the right bin, to Moo Jin Kim's relief.

Usually that spot right there where it identifies the object and goes to it, that's the part where we hold our breath in.

And then very, very slowly and kind of hesitantly, it reached out with its claw and picked up the scoop.

It's doing it. Moo Jin, did I just program a robot?

You did. Looks like it's working.

And to my mind, it's incredible. Like, remember, nobody really programmed the robot exactly. This is all neural network learning how to move the claws and respond to the commands on its own. And to me, it's pretty wild that that works at all. And I think it's going to lead to some very cool developments.

I'm excited to hear more, Geoff. Thank you so much for bringing this reporting to us.

Thank you very much.

We'll link Geoff's full story, which has robot videos, in our episode notes. This episode was produced by Berly McCoy, edited by our showrunner Rebecca Ramirez, and fact-checked by Tyler Jones. Jimmy Keeley was the audio engineer.

I'm Geoff Brumfiel.

I'm Regina Barber. Thank you for listening to Short Wave from NPR.