This story was originally published in The Highlight, Vox’s member-exclusive magazine. To get access to member-exclusive stories every month, join the Vox Membership program today.
There’s something sad about seeing a humanoid robot lying on the floor. Without any electricity, these bipedal machines can’t stand up, so if they’re powered down and not hanging from a winch, they’re sprawled out on the floor, staring up at you, helpless.
That’s how I met Atlas a couple of months ago. I’d seen the robot on YouTube a hundred times, running obstacle courses and doing backflips. Then I saw it on the floor of a lab at MIT. It was just lying there. The contrast is jarring, if only because humanoid robots have become so much more capable and ubiquitous since Atlas got famous on YouTube.
Across town at Boston Dynamics, the company that makes Atlas, a newer version of the humanoid robot had learned not only to walk but also to drop things and pick them back up instinctively, thanks to a single artificial intelligence model that controls its movement. Some of these next-generation Atlas robots will soon be working on factory floors, and they may venture further. Thanks in part to AI, general-purpose humanoids of all kinds seem inevitable.
“In Shenzhen, you can already see them walking down the street every now and then,” Russ Tedrake told me back at MIT. “You’ll start seeing them in your life in places that are probably dull, dirty, and dangerous.”
Tedrake runs the Robot Locomotion Group at the MIT Computer Science and Artificial Intelligence Lab, also known as CSAIL, and he co-led the project that produced the latest AI-powered Atlas. Walking was once the hard thing for robots to learn, but not anymore. Tedrake’s group has shifted focus from teaching robots how to move to helping them understand and interact with the world through software, specifically AI. They’re not the only ones.
In the United States, venture capital investment in robotics startups grew from $42.6 million in 2020 to nearly $2.8 billion in 2025. Morgan Stanley predicts that cumulative global sales of humanoids will reach 900,000 in 2030 and explode to more than 1 billion by 2050, the vast majority of which will be for industrial and commercial purposes. Some believe these robots will ultimately replace human labor, ushering in a new global economic order. After all, we designed the world for humans, so humanoids should be able to navigate it with ease and do what we do.
They won’t all be factory workers, if certain startups get their way. A company called 1X Technologies has started taking preorders for its $20,000 home robot, Neo, which wears clothes, does dishes, and fetches snacks from the fridge. Figure AI released its Figure 03 humanoid robot, which also does chores. Sunday Robotics said it would have fully autonomous robots making coffee in beta testers’ homes next year.
So far, we’ve seen plenty of demos of these AI-powered home robots and promises from the industrial humanoid makers, but not much in the way of a new global economic order. Demos of home robots, like the 1X Neo, have relied on human operators, making these automatons, in practice, more like puppets. Reports suggest that Figure AI and Apptronik have only one or two robots on factory floors at any given time, usually doing menial tasks. That’s a proof of concept, not a threat to the human workforce.
You can think of all these robots as the physical embodiment of AI, or just embodied AI. This is what happens when you put AI into a physical system, enabling it to interact with the real world. Whether that’s in the form of a humanoid robot or an autonomous car, it’s the next frontier for hardware and, arguably, technological progress writ large.
Embodied AI is already transforming how farming works, how we move goods around the world, and what’s possible in surgical theaters. We might be just one or two breakthroughs away from walking, talking, thinking machines that can work alongside us, unlocking a whole new realm of possibilities. “Might” is the key word there.
“If we’re looking for robots that will work side by side with us in the next couple of years, I don’t think it will be humanoids,” Daniela Rus, director of CSAIL, told me not long after I left Tedrake’s lab. “Humanoids are really complicated, and we have to make them better. And in order to make them better, we have to make AI better.”
So to understand the gap between the hype around humanoids and the technology’s real promise, you have to know what AI can and can’t do for robots. You also, unfortunately, have to try to understand what Elon Musk has been up to at Tesla for the past five years.
It’s still embarrassing to watch the part of the Tesla AI Day presentation in 2021 when a person dressed in a robot costume appears on stage dancing to dubstep music. Musk eventually stops the dance and announces that Tesla, “a robotics company,” will have a prototype of a general-purpose humanoid robot, now known as Optimus, the following year. Not many people believed him, and now, years later, Tesla still has not delivered a fully functional Optimus. Never afraid to make a prediction, Musk told audiences at Davos in January 2026 that Tesla’s robot will go on sale next year.
“People took him seriously because he had a great track record,” said Ken Goldberg, a roboticist at the University of California-Berkeley and co-founder of Ambi Robotics. “I think people were impressed by that.”
You can imagine why people got excited, though. With the Optimus robot, Elon Musk promised to eliminate poverty and offer shareholders “infinite” earnings. He said engineers could effectively translate Tesla’s self-driving car technology into software that could power autonomous robots that could work in factories or help around the house. It’s a version of the same vision humanoid robotics startups are chasing today, albeit colored by several years of Musk’s unfulfilled promises.
We now know that Optimus struggles with a lot of the same things as other attempts at general-purpose humanoids. It often requires humans to remotely operate it, and it struggles with dexterity and precision. The 1X Neo, likewise, needed a human’s help to open a refrigerator door and collapsed onto the floor in a demo for a New York Times journalist last year. The hardware seems capable enough. Optimus can dance, and Neo can fold clothes, albeit a bit clumsily. But they don’t yet understand physics. They don’t know how to plan or to improvise. They certainly can’t think.
“People often get too excited by the idea of the robot and not the reality,” said Rodney Brooks, co-founder of iRobot, maker of the Roomba robot vacuum. Brooks, a former CSAIL director, has written extensively and skeptically about humanoid robots.
Clearly, there’s a gap between what’s happening in research labs and what’s being deployed in the real world. Some of the optimism around humanoids is based on good science, though. In 2023, Tedrake coauthored a landmark paper with Tony Zhao, co-founder and CEO of Sunday Robotics, that outlined a novel method for training robots to move like humans. It involves humans performing a task while wearing sensor-laden gloves that send data to an AI model, which enables the robot to figure out how to do that task. This complemented work Tedrake was doing at the Toyota Research Institute that generated robot behavior with the same kinds of methods AI models use to generate images. You’ve heard of large language models, or LLMs. Tedrake calls these large behavior models, or LBMs.
It makes sense. By watching humans do things over and over, these AI models collect enough data to generate new behaviors that can adapt to changing environments. Folding laundry is a popular example of a task that requires nimble hands and better brains. If a robot picks up a shirt and the fabric flops down in an unexpected way, it needs to figure out how to handle that uncertainty. You can’t simply program it to know what to do when there are so many variables. You can, however, teach it to learn.
That’s what makes the lemonade demo so impressive. Some of Rus’s students at CSAIL have been teaching a humanoid robot named Ruby to make lemonade (something that you might want a robot butler to do someday) by wearing sensors that measure not only the movements but also the forces involved. It’s a combination of delicate movements, like pouring sugar, and strong ones, like lifting a jug of water. I watched Ruby do this without spilling a drop. It hadn’t been programmed to make lemonade. It had learned.
The real challenge is getting this method to scale. One way is simply to brute-force it: Employ thousands of humans to perform basic tasks, like folding laundry, to build foundation models for the physical world. Foundation models are large AI models, trained on massive datasets, that can be adapted to specific tasks like generating text, images, or in this case, robot behavior. You can also get humans to teleoperate lots of robots in order to train these models. These so-called arm farms already exist in warehouses in Eastern Europe, and they’re about as dystopian as they sound.
Another option is YouTube. There are a lot of how-to videos on YouTube, and some researchers think that feeding them all into an AI model will provide enough data to give robots a better understanding of how the world works. These two-dimensional videos are obviously limited, if only because they can’t tell us anything about the physics of the objects in the frame. The same goes for synthetic data, which involves a computer rapidly and repeatedly carrying out a task in a simulation. The upside here, of course, is more data, more quickly. The downside is that the data isn’t as good, especially when it comes to physical forces like friction and torque, which also happen to be the most important for robot dexterity.
“Physics is a tough task to master,” Brooks said. “And if you have a robot, which is not good with physics, in the presence of people, it doesn’t end well.”
That’s not even taking into account the many other bottlenecks facing robotics right now. Components have gotten cheaper: You can buy a humanoid robot right now for less than $6,000, compared to the $75,000 it cost to buy Boston Dynamics’ small, four-legged robot Spot five years ago. But batteries remain a major bottleneck, limiting the run time of most humanoids to two to four hours.
Then you have the problem with processing power. The AI models that could make humanoids more human require massive amounts of compute. If that’s done in the cloud, you’ve got latency issues, preventing the robot from reacting in real time. And inevitably, to tie a lot of other constraints into a tidy bundle, the AI is just not good enough.
If you trace the history of AI and the history of robotics back to their origins, you’ll see a braided line. The two technologies have intersected over and over since the birth of the term “artificial intelligence” at a Dartmouth research workshop in the summer of 1956. Then, half a century later, things started heating up on the AI front, when advances in machine learning and powerful processors called GPUs (the things that have now made Nvidia a $5 trillion company) ushered in the era of deep learning. I’m about to throw a few technical terms at you, so bear with me.
Machine learning is a type of AI. It’s when algorithms look for patterns in data and make decisions without being explicitly programmed to do so. Deep learning takes it to another level with the help of a machine learning model called a neural network. You can think of a neural network, a concept that’s even older than AI, as a system loosely modeled on the human brain that’s made up of lots of artificial neurons that do math problems. Deep learning uses multilayered neural networks to learn from huge data sets and to make decisions and predictions. Among other accomplishments, neural networks have revolutionized computer vision to improve perception in robots.
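If “artificial neurons that do math problems” sounds abstract, here is a rough sketch of what that math can look like. It is only an illustration: the weights are made-up numbers, the function names are hypothetical, and a real network would learn its weights from data instead of having them typed in by hand.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs, squashed to a value between 0 and 1."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid activation

def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def tiny_network(inputs):
    """Two stacked layers: the 'multilayered' part of deep learning, in miniature."""
    # Hand-picked, purely illustrative weights (an assumption, not a trained model).
    hidden = layer(inputs, [[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0])
    output = layer(hidden, [[1.0, -1.0]], [0.0])
    return output[0]

print(tiny_network([0.2, 0.8]))
```

The models discussed in this story stack many more layers and contain millions or billions of these neurons, but each one is doing arithmetic of roughly this kind.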
There are different architectures for neural networks that can do different things, like recognize images or generate text. One is called a transformer. The “GPT” in ChatGPT stands for “generative pre-trained transformer,” which is a type of large language model, or LLM, that powers many generative AI chatbots. While you’d think LLMs would be good at making robots think, they really aren’t. Then there are diffusion models, which are often used for image generation and, more recently, making robots appear to think. The framework that Tedrake and his coauthors described in their 2023 research into using generative AI to train robots is based on diffusion.
Three things stand out in this very limited explanation of how AI and robots get along. One is that deep learning requires a huge amount of processing power and, as a result, a huge amount of energy. Another is that the latest AI models work with the help of stacks of neural networks whose millions or even billions of artificial neurons do their magic in mysterious and usually inefficient ways. The third is that, while LLMs are good at language and diffusion models are good at images, we don’t have any models that are good enough at physics to send a 200-pound robot marching into a crowd to shake hands and make friends.
As Josh Tenenbaum, a computational cognitive scientist at MIT, explained to me recently, an LLM can make it easier to talk to a robot, but it’s hardly capable of being the robot’s brains. “You could imagine a system where there’s a language model, there’s a chatbot, you want to talk to your robot,” Tenenbaum said. “Under the hood, what’s actually going on needs to be something much more like our own brains and minds or other animals, not just humans in terms of how it’s embodied and deals with the world.”
So we need better AI for robots, if not in general. Scientists at CSAIL have been working on a couple of physics-inspired and brain-like technologies they’re calling liquid neural networks and linear optical networks. They both fall into the category of state-space models, which are emerging as an alternative or rival to transformer-based models. Whereas transformer-based models look at all available data to identify what’s important, state-space models are much more efficient, as they maintain a summary of the world that gets updated as new data comes in. It’s closer to how the human brain works.
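To make that contrast a little more concrete, here is a toy sketch. It is not how CSAIL’s liquid neural networks work, and both function names and the `decay` parameter are assumptions made for illustration. The first function keeps a single running summary and folds each new reading into it; the second is a crude stand-in for a model that keeps the whole history and looks back over all of it at every step.

```python
def state_space_summary(readings, decay=0.9):
    """Keep one running number (the state) and update it as each reading arrives."""
    state = 0.0
    for x in readings:
        # Fold the new reading into the summary; older readings fade gradually.
        state = decay * state + (1.0 - decay) * x
    return state

def look_at_everything(readings):
    """Crude stand-in for re-scanning all past data at every step; returns the work done."""
    history = []
    lookups = 0
    for x in readings:
        history.append(x)
        lookups += len(history)  # each step revisits the entire history so far
    return lookups

sensor_readings = [1.0] * 1000
print(state_space_summary(sensor_readings))  # memory used: a single number
print(look_at_everything(sensor_readings))   # work grows with the square of the length
```

The summary approach does a fixed amount of work per reading no matter how long the robot has been running, which is the efficiency argument in a nutshell.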
To be perfectly honest, I’d never heard of state-space models until Rus, the CSAIL director, told me about them when we chatted in her office a few weeks ago. She pulled up a video to illustrate the difference between a liquid neural network and a traditional model used for self-driving cars. In it, you can see how the traditional model focuses its attention on everything but the road, while the newer state-space model only looks at the road. If I’m riding in that car, by the way, I want the AI that’s watching the road.
“And instead of a hundred thousand neurons,” Rus said, referring to the traditional neural network, “I have only 19.” And here’s where it gets really compelling. She added, “And because I have only 19, I can actually figure out how these neurons fire and what the correlation is between these neurons and the action of the car.”
You may have already heard that we don’t really know how AI works. If newer approaches bring us a little bit closer to comprehension, it certainly seems worth taking them seriously, especially if we’re talking about the kinds of brains we’ll put in humanoid robots.
When a humanoid robot loses power, when electricity stops flowing to the motors that keep it upright, it collapses into a heap of heavy metal parts. This can happen for any number of reasons. Maybe it’s a bug in the code or a lost wifi connection. And when they’re on, humanoids are full of energy as their joints fight gravity or stand ready to bend. If you imagine being on the wrong side of that incredible mechanical power, it’s easy to doubt this technology.
Some companies that make humanoid robots also admit that they’re not very useful yet. They’re too unreliable to help out around the house, and they’re not efficient enough to be helpful in factories. Furthermore, most of the money being spent developing robots is being spent on making them safe around people. When it comes to deploying robots that can contribute to productivity, that can participate in the economy, it makes a lot more sense to make them highly specialized and not human-shaped.
The embodied AI that will transform the world in the near future is what’s already out there. In fact, it’s what’s been out there for years. Early self-driving cars date back to the 1980s, when Ernst Dickmanns put a vision-guided Mercedes van on the streets of Munich. Researchers from Carnegie Mellon University got a minivan to drive itself across the United States in 1995. Now, decades later, Waymo is operating its robotaxi service in a half-dozen American cities, and the company says its AI-powered cars actually make the roads safer for everyone.
Then there are the Roombas of the world, the robots that are designed to do one thing and keep getting better at it. You can include the vast array of increasingly intelligent manufacturing and warehouse robots in this camp too. By 2027, the year Elon Musk is on track to miss his deadline to start selling Optimus humanoids to the public, Amazon will reportedly replace more than 600,000 jobs with robots. These would probably be boring robots, but they’re safe and effective.
Science fiction promised us humanoids, however. Pick an era in human history, in fact, and someone was dreaming about an automaton that could move like us, talk like us, and do all our dirty work. Replicants, androids, the Mechanical Turk: All these humanoid fantasies imagined an intelligent synthetic self.
Reality gave us package-toting platforms on wheels roving around Amazon warehouses and sensor-heavy self-driving cars clogging San Francisco streets. Even the skeptics think that humanoids will be possible in time. Probably not in five years, but maybe in 50, we’ll get artificially intelligent companions who can walk alongside us. They’ll take baby steps.
“Good robots are going to be clumsy at first, and you have to find applications where it’s okay for the robot to make mistakes and then recover,” Tedrake said. “Let’s not do open-heart surgery right away with these things. This is more like folding laundry.”


