NVIDIA GR00T N1.6 and Cosmos Reason: The Open AI Stack Powering the Humanoid Robot Revolution
NVIDIA just made humanoid robot development accessible. New open models—GR00T N1.6 for full-body control and Cosmos Reason 2 for physical world understanding—are now free on Hugging Face. Boston Dynamics, NEURA Robotics, and Richtech are already deploying them on Jetson Thor. Here's what changed and why it matters.
TL;DR
- GR00T N1.6: open VLA model for humanoid full-body control (free on HuggingFace)
- Cosmos Reason 2: open VLM for physical world reasoning and scene understanding
- Cosmos Transfer 2.5 + Predict 2.5: physically-based synthetic data + policy evaluation
- Jetson Thor: Blackwell SoC running locomotion + NLP + sensors simultaneously
- Jetson T4000: $1,999, 4x performance of prior gen, 1,200 FP4 TFLOPS
- Partners: Boston Dynamics, NEURA Robotics, Humanoid, Richtech, LEM Surgical, Salesforce
Why This Is NVIDIA's "Android Moment" for Robotics
When Android made smartphone development accessible, it kicked off a decade of mobile applications. NVIDIA is attempting the same move with physical AI. By releasing open models, an open SDK, and a standardized hardware platform, they're positioning Isaac + GR00T as the Android of humanoid robotics.
NVIDIA connects its 2 million robotics developers with Hugging Face's 13 million AI builders via the LeRobot framework integration — making GR00T N1.6 and Isaac Lab accessible to anyone who can write Python.
Previously, building a capable humanoid robot required months of pretraining a foundation model from scratch. GR00T N1.6 eliminates that step: developers can skip pretraining, fine-tune on their specific robot and task, and deploy on standardized Jetson Thor hardware. The barrier dropped from a $10M+ research project to a $2,000 hardware investment and a few weeks of fine-tuning.
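To make the "skip pretraining, fine-tune, deploy" workflow concrete, here is a minimal Python sketch. The `GR00TPolicy` class, its methods, and the checkpoint ID are illustrative stand-ins for this article, not the real Isaac GR00T or LeRobot API — they only mirror the shape of a vision-language-action (VLA) workflow: load a pretrained foundation, fine-tune on task demonstrations, then map (vision, language) inputs to actions.

```python
# Illustrative sketch of the "skip pretraining, fine-tune" workflow.
# GR00TPolicy, from_pretrained, fine_tune, and the checkpoint ID are
# hypothetical stand-ins, NOT the real Isaac GR00T / LeRobot API.

from dataclasses import dataclass, field


@dataclass
class GR00TPolicy:
    """Stand-in for a pretrained vision-language-action (VLA) policy."""
    checkpoint: str                          # e.g. a Hugging Face model ID
    task_adapters: dict = field(default_factory=dict)

    @classmethod
    def from_pretrained(cls, checkpoint: str) -> "GR00TPolicy":
        # In practice this would download pretrained weights from
        # Hugging Face; here it just records the starting checkpoint.
        return cls(checkpoint=checkpoint)

    def fine_tune(self, task: str, demonstrations: list) -> None:
        # Real fine-tuning runs gradient updates on demonstration
        # trajectories; this stub just records demos per task.
        self.task_adapters[task] = len(demonstrations)

    def act(self, image, instruction: str) -> str:
        # A real VLA maps (vision, language) -> continuous joint actions;
        # here we return a label so the control flow is visible.
        if instruction in self.task_adapters:
            return f"action_for:{instruction}"
        return "fallback_action"


# Start from the open foundation model instead of pretraining from scratch.
policy = GR00TPolicy.from_pretrained("nvidia/GR00T-N1.6")  # ID assumed
policy.fine_tune("pick up the red cube", demonstrations=[{}, {}, {}])
print(policy.act(image=None, instruction="pick up the red cube"))
```

The point of the sketch is the cost structure: the expensive step (pretraining) is replaced by a single `from_pretrained` call, and only the cheap task-specific step remains on the developer's side.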
The Full Model Stack: What NVIDIA Released
| Model | Type | Function | Access |
|---|---|---|---|
| GR00T N1.6 | VLA (Vision-Language-Action) | Full-body humanoid control; action generation from vision + language input | Free on Hugging Face |
| Cosmos Reason 2 | VLM (Vision-Language Model) | Physical world reasoning; scene understanding; contextual planning for robots | Free on Hugging Face |
| Cosmos Transfer 2.5 | World Model | Physically-based synthetic data generation for robot training | Free on Hugging Face |
| Cosmos Predict 2.5 | World Model | Robot policy evaluation in simulation before real-world deployment | Free on Hugging Face |
| Isaac Lab-Arena | Simulation Framework | Large-scale robot policy benchmarking; standardized simulation environment | Open source (GitHub) |
| OSMO | Orchestration | Cloud-native unified command center for robotic dev workflows | NVIDIA Cloud |
Jetson Thor Hardware: Specs and Pricing
The software stack runs on two new hardware platforms. The key advance is Multi-Instance GPU (MIG) technology that lets a single chip simultaneously handle bipedal locomotion, natural language processing, and multi-sensor fusion — all tasks that previously required separate compute modules.
| Platform | Architecture | Performance | Use case | Price |
|---|---|---|---|---|
| Jetson AGX Thor | Blackwell, MIG | Full humanoid: locomotion + NLP + sensors on one SoC | Humanoid robots (Boston Dynamics, NEURA) | Partner pricing |
| Jetson T4000 | Blackwell | 1,200 FP4 TFLOPS — 4x prior gen | Autonomous machines, drones, industrial robots | $1,999 (1K unit vol.) |
| IGX Thor | Blackwell, industrial | Functional safety, enterprise software support | Industrial edge, manufacturing, surgery | Enterprise pricing |
Real-World Deployments: Who's Using It Now
- Boston Dynamics: Integrating Jetson Thor with GR00T-enabled workflows to simulate, train, and validate new robot behaviors for Stretch and Atlas successors.
- NEURA Robotics and Humanoid: Using GR00T N1.6 and Cosmos Reason to build humanoids that can reason about their environment and execute complex manipulation tasks in human workspaces.
- Richtech Robotics: Deploying Jetson Thor-powered robots in food service and hospitality environments, using GR00T's action generation for natural human-robot interaction.
- Salesforce: Using Agentforce + Cosmos Reason + NVIDIA Blueprint to analyze robot-captured video footage, cutting incident resolution times in half in warehouse operations.
- LEM Surgical: Using Isaac for Healthcare and Cosmos Transfer to train autonomous surgical arms on the Dynamis robot, powered by Jetson AGX Thor with Holoscan.
- Franka Robotics: Using GR00T-enabled workflows for simulation-to-real transfer — training manipulation policies in Isaac Lab-Arena before deploying on physical Franka arms.
The Sim-to-Real Pipeline: How GR00T Training Works
The key insight in NVIDIA's stack is the simulation-to-real (sim-to-real) pipeline, which dramatically reduces the real-world data needed to train capable robots:
- Generate synthetic data: Cosmos Transfer 2.5 creates physically accurate synthetic environments and robot scenarios, including edge cases that would be dangerous or expensive to create in the real world.
- Start from GR00T N1.6: Use the open model as a pretrained foundation; skip the trillion-token pretraining that took NVIDIA's research team months.
- Fine-tune for your robot: Use Isaac Lab-Arena to benchmark and refine your robot's specific tasks in simulation — manipulation, navigation, human interaction.
- Validate with Cosmos Predict 2.5: Evaluate the robot policy in simulated scenarios before physical deployment, catching failure modes before they damage hardware.
- Deploy on Jetson Thor: Run inference directly on the robot with sub-50ms action latency — fast enough for real-time bipedal locomotion correction.
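The five steps above can be sketched as a single gated pipeline. All function names here are illustrative stand-ins (not NVIDIA APIs); the one structural point they encode is that deployment is gated on a simulated success rate, so failure modes surface in Cosmos Predict-style evaluation rather than on hardware.

```python
# Sketch of the five-stage sim-to-real pipeline described above.
# Every function is an illustrative stand-in, not an NVIDIA API.

import random


def generate_synthetic_data(n_scenarios: int) -> list[dict]:
    """Stage 1 (Cosmos Transfer stand-in): synthesize training scenarios,
    including rare edge cases that are unsafe to stage physically."""
    return [{"scenario_id": i, "edge_case": i % 10 == 0} for i in range(n_scenarios)]


def fine_tune_policy(base: str, data: list[dict]) -> dict:
    """Stages 2-3: start from the open foundation model and fine-tune on
    the synthetic scenarios (here, just record what we trained on)."""
    return {"base": base, "trained_on": len(data)}


def evaluate_policy(policy: dict, n_rollouts: int, seed: int = 0) -> float:
    """Stage 4 (Cosmos Predict stand-in): estimate the policy's success
    rate in simulation before any hardware is at risk."""
    rng = random.Random(seed)  # seeded so the evaluation is repeatable
    return sum(rng.random() < 0.9 for _ in range(n_rollouts)) / n_rollouts


def deploy_if_safe(success_rate: float, threshold: float = 0.85) -> bool:
    """Stage 5: gate real-world deployment on the simulated success rate."""
    return success_rate >= threshold


data = generate_synthetic_data(1000)
policy = fine_tune_policy("GR00T-N1.6", data)
rate = evaluate_policy(policy, n_rollouts=200)
print(f"sim success rate: {rate:.2f}, deploy: {deploy_if_safe(rate)}")
```

The 0.85 threshold is an arbitrary placeholder; in practice the gate would be a task-specific benchmark score from Isaac Lab-Arena rather than a single scalar.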
What This Means for the AI Agent Ecosystem
Most AI agent discussions focus on software agents — but physical AI agents are now on the same trajectory. The same pattern applies: NVIDIA is building the infrastructure layer (models + hardware + simulation) so that companies like Boston Dynamics and NEURA can focus on the application layer.
For enterprise AI teams, the immediate implication is that robots as AI agents are no longer a 5-year horizon. Companies that deployed warehouse robots in 2024 are now evaluating GR00T-powered upgrades that add cognitive reasoning and task adaptation — capabilities that previously required a full robotics research team to develop.
Frequently Asked Questions
What is NVIDIA GR00T N1.6?
An open Vision-Language-Action model for humanoid robots. It enables full-body control by processing visual and language inputs to generate robot actions. Free on Hugging Face, runs on Jetson Thor.
What is Cosmos Reason 2?
An open Vision-Language Model for physical world reasoning — lets robots see, understand, and plan actions in real environments. Works alongside GR00T N1.6 to add cognitive reasoning to physical control.
Which companies are using NVIDIA GR00T?
Boston Dynamics, NEURA Robotics, Humanoid, Richtech Robotics, Salesforce (video intelligence), LEM Surgical (surgical robots), and Franka Robotics.
What does Jetson Thor cost?
The Jetson T4000 module for autonomous machines is $1,999 at 1,000-unit volume. Jetson AGX Thor (for humanoids) is available through NVIDIA robotics partner pricing.
Can I build a humanoid robot with these open models?
Yes. GR00T N1.6 and Isaac technologies are integrated into Hugging Face's LeRobot framework. Hugging Face's Reachy 2 humanoid is fully interoperable with Jetson Thor — the full stack is open source.
Build AI Agent Workflows — Digital or Physical
HappyCapy helps teams design and deploy agentic workflows — from software automation today to physical robot integration as the GR00T stack matures.
Try HappyCapy Free