Front Desk Information Concierge assisted GTC 2026 guests with navigation and event information. | Credit: The Robot Report
In just over a year, IntBot Inc. has gone from concept to full-body humanoids greeting thousands of guests at NVIDIA’s GTC and in hotel lobbies. The Sunnyvale, Calif.-based startup has used 24/7 interaction footage and sentiment analysis to train a social intelligence engine that sits on top of off-the-shelf hardware.
At GTC 2026, CEO Lei Yang announced that the company’s IntEngine “general social intelligence engine” now supports multiple humanoid and service robots from different hardware vendors. He said this marked a significant step toward hardware-agnostic deployment of socially intelligent robots in real-world environments.
This full-length view of the IntBot humanoid shows its feet, although the robot was stationary and secured to stand for its shift on the GTC26 help desk. | Credit: The Robot Report
IntBot also showcased the first edge deployment of the NVIDIA Cosmos Reason-2 vision-language model (VLM) within its software stack. Running directly on robot edge compute systems, the model enables robots to perform real-time scene understanding, allowing them to interpret complex human environments such as crowded conference spaces.
“The first-generation robot was a pre-programmed kind of motion. But for our robot, if you were at CES or watched the videos, all of the emotions are generated,” stated Yang. “So even if you’re not talking to the robot, the robot would respond with some very natural, very subtle motion, just the very side nodding, to show ‘Okay, I’m listening,’ or even the kind of motion to indicate ‘I’m alive.’ Everything is driven by our social intelligence.”
According to Yang, IntBot’s robots use a form of audio-visual fusion, combining what they hear with what they see, to better understand who in the scene is talking and what the speakers’ intent might be. This allows the robot to provide a more natural interaction with humans.
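As a rough illustration of the general idea of audio-visual fusion, and not a description of IntBot’s actual system, the Python sketch below picks the most likely speaker by combining a sound direction estimate with per-face lip-motion evidence. Every name here (`Face`, `speaker_scores`, `likely_speaker`), the Gaussian weighting, and the thresholds are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Face:
    """One detected person (hypothetical structure for this sketch)."""
    person_id: str
    azimuth_deg: float  # bearing of the face relative to the robot
    lip_motion: float   # 0..1 visual evidence that the mouth is moving


def speaker_scores(faces, audio_azimuth_deg, sigma_deg=15.0):
    """Score each face by how well it matches the sound direction
    and how strongly its lips appear to be moving."""
    scores = {}
    for face in faces:
        # Smallest signed angle between the face and the sound source.
        diff = (face.azimuth_deg - audio_azimuth_deg + 180.0) % 360.0 - 180.0
        # Gaussian falloff: faces far from the sound direction score low.
        audio_match = math.exp(-0.5 * (diff / sigma_deg) ** 2)
        scores[face.person_id] = audio_match * face.lip_motion
    return scores


def likely_speaker(faces, audio_azimuth_deg, min_score=0.2):
    """Return the best-scoring person, or None if nobody is convincing."""
    scores = speaker_scores(faces, audio_azimuth_deg)
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_score else None


faces = [
    Face("a", azimuth_deg=-30.0, lip_motion=0.9),
    Face("b", azimuth_deg=10.0, lip_motion=0.8),
    Face("c", azimuth_deg=12.0, lip_motion=0.1),
]
print(likely_speaker(faces, audio_azimuth_deg=8.0))
```

In this toy scene, person "a" is moving their lips but stands far from the sound direction, and person "c" is near it but not visibly speaking, so only "b" scores well on both cues. A production system would presumably use learned models rather than a hand-tuned product of two scores.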
IntBot stays platform-neutral
While most humanoid startups chase ever-better locomotion and manipulation, IntBot is deliberately staying “hardware agnostic,” positioning its software as a social intelligence layer that can ride on top of whatever platforms the industry produces next.
Today, that stack powers Nylo, a full-body humanoid that works 24/7 as a multilingual concierge in hotel lobbies from New York to Las Vegas, blending on-device perception and body-language generation with cloud LLMs for deeper queries.
“Right now, we already have three hotels across the U.S.,” said Yang. “[We’re at] The Nap York in New York City, and a second one is called Otonomous in Las Vegas, and the third one is a Marriott Hotel in Tulsa, Okla. And all of these three robots operate 24/7, basically. They work alongside their human staff members, but IntBot offers add-on capabilities to enhance what the human staff, concierge, or [other] people can do.”
By focusing first on noisy, real-world environments like CES and busy hotel lobbies, where earlier kiosk-style systems and robots like Pepper stumbled, Lei Yang is betting that mastering natural, multi-party interaction will be the key to getting humanoids accepted as everyday co-workers, not just trade-show spectacles.
IntEngine coordinates perception, communication in real time
Nylo’s ability to operate autonomously on the GTC show floor is powered by IntEngine, IntBot’s proprietary, multimodal, multi-loop social intelligence system. IntEngine fuses vision, audio, and language in real time to coordinate speech, facial expression, and gesture, enabling robots to perceive social context and respond naturally.
This architecture allows Nylo not just to respond, but to decide when and how to engage, which Yang said is an essential capability for operating in open, public environments.
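To make the idea of deciding “when and how to engage” concrete, here is a minimal Python sketch of a rule-based engagement policy. It is an assumption-laden illustration and not IntEngine’s design: the cue names in `PersonState`, the `Action` levels, and the distance and dwell thresholds are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    IDLE = "idle"                # keep subtle "alive" motion only
    ACKNOWLEDGE = "acknowledge"  # e.g. a nod or brief eye contact
    GREET = "greet"              # proactively start a conversation
    RESPOND = "respond"          # answer a person who spoke to the robot


@dataclass
class PersonState:
    """Perception summary for one person (hypothetical fields)."""
    distance_m: float      # how far the person is from the robot
    facing_robot: bool     # whether the person is oriented toward it
    dwell_s: float         # how long the person has lingered nearby
    addressed_robot: bool  # whether speech was directed at the robot


def choose_action(p, near_m=1.5, far_m=4.0, dwell_threshold_s=2.0):
    """Pick an engagement level from simple social cues."""
    if p.distance_m > far_m:
        return Action.IDLE  # too far away to be a social partner
    if p.addressed_robot and p.facing_robot:
        return Action.RESPOND
    if p.facing_robot and p.distance_m <= near_m and p.dwell_s >= dwell_threshold_s:
        return Action.GREET  # close, attentive, and lingering
    if p.facing_robot:
        return Action.ACKNOWLEDGE
    return Action.IDLE  # passers-by who are not looking are left alone


passerby = PersonState(distance_m=1.0, facing_robot=False, dwell_s=10.0, addressed_robot=False)
lingering = PersonState(distance_m=1.0, facing_robot=True, dwell_s=3.0, addressed_robot=False)
print(choose_action(passerby).value, choose_action(lingering).value)
```

The point of the sketch is only that engagement is a graded decision driven by several cues at once, which is why a crowded show floor is harder than a one-on-one kiosk interaction.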


