The gig workers who are training humanoid robots at home
12 min
Apr 8, 2026
Summary
Gig workers in Nigeria, India, and Argentina are being hired by companies like Micro One to record themselves performing household chores, providing training data for humanoid robots. While the work pays well locally, it raises significant concerns around privacy, data security, informed consent, and whether home-based data collection is reliable enough to safely train robots.
Insights
- Real-world movement data from gig workers is becoming the primary training method for humanoid robots, replacing virtual simulations that cannot accurately model physics
- The humanoid robot training data market represents a $100M+ annual opportunity, with over $6B invested in humanoid robotics in 2025 alone
- Workers lack transparency about how their data is used, stored, and shared with third-party robotics companies, creating ethical and informed consent gaps
- Home-based data collection creates practical challenges around privacy, safety standards, and data quality control that the industry has not yet solved
- The scale of data needed to train humanoid robots may exceed current collection rates by orders of magnitude, potentially requiring years of continued data gathering
Trends
- Distributed gig economy model for AI/robotics training data collection across 50+ countries
- Shift from synthetic simulation to real-world human movement data as the gold standard for robot training
- Privacy and consent gaps widening as biometric and behavioral data collection becomes more intimate and distributed
- Emergence of data annotation and collection as a significant gig work category in developing economies
- Robotics companies outsourcing data collection to specialized intermediaries rather than collecting in-house
- AI-assisted quality control and annotation of training data becoming standard practice
- Regulatory and ethical frameworks lagging behind rapid commercialization of human-generated training data
- Competition intensifying among data collection companies (Micro One, Scale AI, Encore) for worker recruitment
- Integration of exoskeletons and VR headsets in state-sponsored robot training programs (China model)
- Uncertainty about data sufficiency and safety standards for deploying robots trained on home-based behavioral data
Topics
- Humanoid Robot Training Data Collection
- Gig Economy and Remote Work
- AI Training Data Privacy and Consent
- Robotics and Automation
- Data Annotation and Labeling
- Informed Consent in AI Development
- International Labor and Gig Work
- Robot Safety and Real-World Deployment
- Synthetic vs. Real-World Training Data
- Data Quality Control and Validation
- Biometric Data Collection Ethics
- Developing Economy Tech Jobs
- Third-Party Data Sharing Practices
- Large Language Models and Robotics Paradigm Shift
- Occupational Safety in Robot Training
Companies
Micro One
US-based data collection company in Palo Alto hiring thousands of gig workers globally to record movement data for ro...
Tesla
Mentioned as one of the major companies racing to build humanoid robots that rely on real-world training data
Figure AI
Humanoid robotics company competing to build robots designed to work in factories and homes using collected movement ...
Agility Robotics
Robotics company developing humanoid robots that are trained using real-world movement data from gig workers
Scale AI
Data collection company that is recruiting armies of data recorders and has gathered over 100,000 hours of training footage
Encore
Data collection company recruiting workers to record movement data for robotics training
DoorDash
Paying delivery drivers to film themselves doing chores to contribute to humanoid robot training datasets
ASTM International
Standards organization where roboticist Aaron Prather raises concerns about safety and data quality in robot training
People
Matt Honan
Host and editor-in-chief introducing the episode
Michelle Kim
Journalist who wrote the original article about gig workers training humanoid robots
Ali Ansari
CEO of Micro One who defends the company's practices and estimates that robotics companies spend $100M+ annually on real-world training data
Yasmin Katturi
University of Maryland expert advocating for worker transparency and informed consent in robot training data collection
Aaron Prather
Roboticist at ASTM International who raises concerns about safety standards and data quality when training robots on home-based behavioral data
Ken Goldberg
UC Berkeley researcher who estimates that humanoid robots may require more training data than large language models and that collecting enough could take years
Quotes
"This would be a real nice opportunity to set a mark and give data that will be used to train robots in the future"
Zeus (pseudonym) • Early in episode
"I really do not like it so much. I'm the kind of person that requires a technical job that requires me to think."
Zeus (pseudonym) • Mid-episode
"There is a lot of demand and it's increasing really fast"
Ali Ansari, CEO of Micro One • Mid-episode
"It is important that if workers are engaging in this, that they are informed by the companies themselves of the intention, where this kind of technology might go and how that might affect them longer term"
Yasmin Katturi, University of Maryland • Mid-episode
"It's going to take longer than people think"
Ken Goldberg, UC Berkeley • Late in episode
Full Transcript