NVIDIA Drops 110 Open-Source Skills for Physical AI Devs
NVIDIA's Agent Toolkit lands 110+ verified skills on GitHub covering robotics, autonomous vehicles, vision AI, and industrial systems - turning complex physical AI pipelines into single agent calls.

A robotics engineer at a consumer electronics factory needs synthetic defect images to train a visual inspection model. Without automation, this means wiring together Isaac Sim, a Cosmos world model, a labeling pipeline, and a training job - three or four days of Python glue code, environment debugging, and format juggling before a single training image is produced.
With NVIDIA's new Agent Toolkit skills, that same pipeline becomes a single agent call. The skill handles orchestration; the engineer writes a prompt.
That's the premise behind the 110+ open-source agent skills NVIDIA released on June 1 at GTC Taipei. The skills are available now at github.com/nvidia/skills and through skills.sh, compatible with any coding agent - Claude Code, Codex, or anything else that can run shell commands.
TL;DR
- 110+ verified skills on GitHub spanning robotics, AVs, vision AI, and industrial systems
- Installs via `npx skills add nvidia/skills`; each skill includes agent instructions, governance metadata, and a cryptographic signature
- Builds on Cosmos 3, Isaac Sim, Metropolis, Alpamayo, and Jetson
- Li Auto runs 1,000+ neural scene reconstructions and 300,000+ rendered frames daily using the underlying pipeline
- Security layer via NemoClaw and OpenShell is built in
The Problem These Skills Solve
Physical AI development is a pipeline problem. Training a robot to navigate a warehouse is not one task - it's six sequential tasks with incompatible APIs and independent failure modes. Agents can execute steps competently. They can't determine the sequence without guidance.
A skill fills that gap: it's a machine-readable document that tells an agent exactly what tools to call, what outputs to produce, and how to verify results. NVIDIA's skills library is a curated public collection of those documents, covering its own platforms.
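As a rough illustration of what such a document encodes, the sketch below models a skill as a small Python data structure. The class name `SkillSpec`, the field names (`tools`, `expected_outputs`, `verification`), and the example values are assumptions made for illustration; they are not taken from NVIDIA's actual SKILL.md schema.

```python
from dataclasses import dataclass, field


@dataclass
class SkillSpec:
    """Hypothetical in-memory view of a skill document (not NVIDIA's schema)."""

    name: str
    tools: list[str] = field(default_factory=list)             # what the agent should call
    expected_outputs: list[str] = field(default_factory=list)  # what it should produce
    verification: list[str] = field(default_factory=list)      # how to check the results

    def is_complete(self) -> bool:
        # A skill is only actionable if all three questions are answered.
        return bool(self.tools and self.expected_outputs and self.verification)


# Illustrative values only; the tool name and the check are made up.
spec = SkillSpec(
    name="metropolis-defect-image-generation",
    tools=["render_defects"],
    expected_outputs=["images/", "annotations.json"],
    verification=["compare against BENCHMARK.md thresholds"],
)
print(spec.is_complete())
```

The point of the structure is that an agent given only a name and a goal has nothing to check its work against; the verification field is what separates a skill from a prompt.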
"AI agents are revolutionizing software development, and that shift is now coming to physical AI," said Jensen Huang at GTC Taipei. "Developers can now use agents to build the robots, autonomous vehicles, and industrial systems of the future at an incredible pace."
NVIDIA's physical AI agent skills target factories, logistics centers, and autonomous vehicle fleets - environments where simulation-to-reality pipelines determine whether a model works in production.
How the Pipeline Runs
The synthetic data pipeline for a manufacturing inspection model, executed through NVIDIA agent skills, runs like this:
- Scene initialization - Agent calls the `cosmos-neural-reconstruction` skill with a folder of raw camera frames
- World model generation - Cosmos 3 reconstructs a 3D scene representation from the input
- Defect variation - Agent calls `metropolis-defect-image-generation` with defect type parameters (scratch, dent, missing component)
- Synthetic frame generation - Cosmos renders photorealistic defect images at scale across the reconstructed scene
- Video augmentation - Agent calls `metropolis-video-augmentation` to add lighting variation, occlusion, and sensor noise
- Export formatting - Skill handles output to COCO or Pascal VOC format based on the downstream training framework
- Validation - Each skill ships with a `BENCHMARK.md` and Tier-3 evaluation datasets; agent verifies outputs meet quality thresholds
- Training job handoff - Skill passes a structured manifest to the training orchestrator
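The steps above amount to a sequential pipeline in which each stage consumes the previous stage's output. The following is a minimal sketch of that control flow only: `run_pipeline` and the three stub steps are hypothetical stand-ins, not NVIDIA skill APIs, and the state keys (`frames_dir`, `defect_types`, `manifest`) are assumed for illustration.

```python
from typing import Callable

# Each step reads the accumulated state and returns new keys to merge into it.
Step = Callable[[dict], dict]


def run_pipeline(steps: list[tuple[str, Step]], initial: dict) -> dict:
    """Run steps in order and stop at the first failure."""
    state = dict(initial)
    completed: list[str] = []
    for name, step in steps:
        try:
            state = {**state, **step(state)}
        except Exception as exc:
            raise RuntimeError(f"step '{name}' failed after {completed}") from exc
        completed.append(name)
    return {**state, "completed": completed}


# Stub steps standing in for real skill calls (hypothetical).
def reconstruct_scene(state: dict) -> dict:
    return {"scene": f"scene built from {state['frames_dir']}"}


def generate_defects(state: dict) -> dict:
    return {"images": [f"{kind}.png" for kind in state["defect_types"]]}


def export_manifest(state: dict) -> dict:
    return {"manifest": {"format": "COCO", "num_images": len(state["images"])}}


result = run_pipeline(
    [
        ("scene-reconstruction", reconstruct_scene),
        ("defect-generation", generate_defects),
        ("export", export_manifest),
    ],
    {"frames_dir": "raw_frames/", "defect_types": ["scratch", "dent", "missing_component"]},
)
print(result["manifest"])
```

Naming each step and recording which ones completed makes a mid-pipeline failure point at a specific stage rather than at the run as a whole.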
The sequence runs without human intervention once the agent receives an initial prompt. Pegatron, which runs contract electronics manufacturing for multiple major hardware companies, reports a 67% reduction in training and deployment time using NVIDIA's synthetic data pipeline.
Step by Step
Installing Skills
Skills install through npm's npx toolchain:
```bash
# Install the full NVIDIA skills catalog
npx skills add nvidia/skills

# Install one skill directly, no prompts
npx skills add nvidia/skills --skill metropolis-defect-image-generation --yes
```
Each installed skill lands with three files: SKILL.md (agent-readable instructions), skill-card.md (governance metadata), and skill.oms.sig (a cryptographic signature verifiable against NVIDIA's root certificate). A skill file is text; a tampered skill is a supply chain vector. NVIDIA is treating provenance as infrastructure from day one. Instant H100-backed access is available through NVIDIA Brev, with cloud integrations from Microsoft Azure, CoreWeave, and Nebius.
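A quick local sanity check on the three-file layout might look like the sketch below. It only confirms that the expected files exist and records a SHA-256 digest of SKILL.md; it does not verify `skill.oms.sig`, which would require NVIDIA's own tooling and root certificate. The function name `inspect_skill_dir` and the temporary demo directory are assumptions for illustration.

```python
import hashlib
import tempfile
from pathlib import Path

EXPECTED_FILES = ("SKILL.md", "skill-card.md", "skill.oms.sig")


def inspect_skill_dir(skill_dir: Path) -> dict:
    """Report missing files and a SHA-256 digest of SKILL.md.

    This is NOT signature verification. The digest is only useful for
    noticing local changes against a value recorded earlier.
    """
    missing = [name for name in EXPECTED_FILES if not (skill_dir / name).is_file()]
    skill_md = skill_dir / "SKILL.md"
    digest = hashlib.sha256(skill_md.read_bytes()).hexdigest() if skill_md.is_file() else None
    return {"missing": missing, "skill_md_sha256": digest}


# Demo against a throwaway directory that contains only SKILL.md.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "SKILL.md").write_bytes(b"example instructions\n")
    report = inspect_skill_dir(demo_dir)

print(report["missing"])
```

Recording the digest at install time and comparing it before each run is a cheap complement to the signature, not a replacement for it.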
Cosmos and Reconstruction Skills
The Cosmos 3 world foundation model underpins the reconstruction and generation skills. Neural Reconstruction takes lidar or multi-camera fleet data and builds a simulation-ready 3D scene. Li Auto and DeepRoute.ai use this at production scale - Li Auto reports 1,000+ neural reconstructions and over 300,000 rendered frames daily.
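To put the Li Auto figures in perspective, the arithmetic below converts them into per-reconstruction and per-second rates. Both inputs are reported lower bounds ("1,000+" and "300,000+"), and the source does not say that frames are spread evenly across reconstructions, so treat the results as rough averages.

```python
RECONSTRUCTIONS_PER_DAY = 1_000  # reported lower bound
FRAMES_PER_DAY = 300_000         # reported lower bound
SECONDS_PER_DAY = 24 * 60 * 60

frames_per_reconstruction = FRAMES_PER_DAY / RECONSTRUCTIONS_PER_DAY
frames_per_second = FRAMES_PER_DAY / SECONDS_PER_DAY

print(f"{frames_per_reconstruction:.0f} frames per reconstruction (average)")
print(f"{frames_per_second:.2f} frames per second (sustained over 24 hours)")
```

That works out to roughly 300 frames per reconstruction and about 3.5 rendered frames per second around the clock.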
Isaac Sim and Robotics Skills
For robotics, the skills layer over Isaac Sim and Isaac Lab. An agent can launch a simulation session, author a scene, control robot actuators, capture training data, and run closed-loop evaluation - all through skill calls without custom orchestration code. Isaac Lab skills cover reinforcement learning setup, training runs, evaluation loops, and custom environment development. Agility Robotics, Universal Robots, and 1X Technologies are among the companies using these skills in active projects.
The June 1 announcement also included the Isaac GR00T Reference Humanoid Robot - a Unitree H2-based platform with 75 degrees of freedom and Jetson Thor compute, shipping late 2026. Research institutions including Stanford and ETH Zurich will use it to validate the full skills pipeline on physical hardware.
The Isaac GR00T Reference Humanoid Robot closes the sim-to-real loop: train in Isaac Sim using the new skills, confirm on a platform with known hardware characteristics.
Autonomous Vehicles and Vision AI
Alpamayo 2 Super, a 32-billion-parameter vision-language-action model for level-4 autonomous driving, gets its own skill layer. AlpaGym connects policy rollouts to high-fidelity simulation; OmniDreams produces photorealistic camera frames in real time that respond to policy actions. Li Auto generates novel driving scenarios at a rate that on-road data collection can't match, covering the long tail of rare events without sending fleets out to find low-frequency edge cases.
For vision AI, Metropolis skills cover Defect Image Generation, Video Search and Summarization, and Video Augmentation. Delta Electronics improved inspection detection rates by 17%; Foxconn reports a roughly 3% gain in first-pass manufacturing yield using synthetic defect data from this pipeline.
Pipeline vs. Manual: What Changes
| Step | Manual approach | With NVIDIA skills |
|---|---|---|
| Scene reconstruction | Write custom Omniverse importers, manage file formats | cosmos-neural-reconstruction skill, one call |
| Defect data generation | Script image augmentation, manage GPU memory manually | metropolis-defect-image-generation, declarative parameters |
| RL training setup | Configure Isaac Lab environment by hand | Isaac Lab skills handle environment authoring and training loop |
| Policy evaluation | Write eval harness, manage simulation/policy handoffs | Skills include validation against Tier-3 eval datasets |
| Export and handoff | Custom format conversion per downstream framework | Skill outputs structured manifest in standard formats |
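The "custom format conversion" row is easiest to see with bounding boxes. COCO stores a box as `[x_min, y_min, width, height]`, while Pascal VOC stores corner coordinates `(xmin, ymin, xmax, ymax)`. The sketch below shows that one conversion on a hand-written COCO fragment; the file name and category are invented, and it does not adjust for the 1-based pixel indexing that some VOC tools expect.

```python
def coco_to_voc_bbox(bbox: list[float]) -> tuple[float, float, float, float]:
    """Convert COCO [x_min, y_min, width, height] to VOC (xmin, ymin, xmax, ymax)."""
    x_min, y_min, width, height = bbox
    return (x_min, y_min, x_min + width, y_min + height)


# Minimal COCO-style annotation file contents (illustrative values).
coco = {
    "images": [{"id": 1, "file_name": "board_0001.png", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "scratch"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [100, 50, 40, 20], "area": 800, "iscrowd": 0},
    ],
}

voc_boxes = [coco_to_voc_bbox(ann["bbox"]) for ann in coco["annotations"]]
print(voc_boxes)  # [(100, 50, 140, 70)]
```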
Where It Breaks
Hardware requirements are steep. NVIDIA Brev provides instant H100 access, but running Cosmos reconstruction locally requires at minimum an A100 with 80 GB VRAM. Developers without cloud credits or high-end workstations will hit resource limits fast.
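Before attempting a local run, it is worth checking what the machine actually has. The sketch below parses the output of `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits`, which reports total memory in MiB. The 80,000 MiB threshold is an assumed approximation of "80 GB" (reported totals vary slightly by card and driver), and the sample text stands in for real command output.

```python
import subprocess

QUERY = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"]


def parse_gpus(csv_text: str) -> list[tuple[str, int]]:
    """Parse 'name, memory_mib' lines into (name, memory_mib) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, memory = line.rsplit(",", 1)
        gpus.append((name.strip(), int(memory.strip())))
    return gpus


def has_enough_memory(gpus: list[tuple[str, int]], min_mib: int = 80_000) -> bool:
    # min_mib is an assumed threshold, not an official NVIDIA requirement.
    return any(memory >= min_mib for _, memory in gpus)


def query_local_gpus() -> list[tuple[str, int]]:
    """Run nvidia-smi on this machine (requires an NVIDIA driver)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpus(out)


# Sample output used here so the example runs without a GPU.
sample = "NVIDIA A100-SXM4-80GB, 81920\nNVIDIA GeForce RTX 4090, 24564\n"
print(has_enough_memory(parse_gpus(sample)))
```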
API versioning is a real risk. Skills encode specific library versions and API contracts. As Isaac Sim, Cosmos, and Alpamayo update, skills drift out of sync unless the catalog is actively maintained. NVIDIA maintains Tier-3 eval datasets per skill, but sustaining 110+ skills across six platforms is a significant ongoing commitment.
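One way to notice this kind of drift early is to compare the versions a skill was written against with what is installed. The sketch below assumes a skill exposes its pins as a simple name-to-version mapping, which is a hypothetical format; the package names and versions are invented, and the comparison is a deliberately crude major.minor match rather than full version-specifier handling.

```python
import re
from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Keep the leading numeric components only: '4.5.0rc1' -> (4, 5, 0)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def find_drift(pinned: dict[str, str], installed_version=metadata.version) -> dict:
    """Return packages whose installed major.minor differs from the pin."""
    drift = {}
    for package, expected in pinned.items():
        try:
            actual = installed_version(package)
        except metadata.PackageNotFoundError:
            drift[package] = (expected, None)  # not installed at all
            continue
        if parse_version(actual)[:2] != parse_version(expected)[:2]:
            drift[package] = (expected, actual)
    return drift


# Fake lookup so the example is deterministic; names and versions are made up.
fake_installed = {"sim-toolkit": "5.0.0", "world-model": "3.0.9"}


def fake_lookup(package: str) -> str:
    if package not in fake_installed:
        raise metadata.PackageNotFoundError(package)
    return fake_installed[package]


pins = {"sim-toolkit": "4.5.0", "world-model": "3.0.1", "missing-lib": "1.0.0"}
print(find_drift(pins, installed_version=fake_lookup))
```

Called without the second argument, `find_drift` falls back to `importlib.metadata.version` and checks the real environment.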
Agent reliability isn't guaranteed. A skill tells an agent what to do; it can't force the agent to execute correctly. Physical AI pipelines are long enough that accumulated errors - wrong parameter types, mismatched output formats, GPU OOM conditions - can go unnoticed unless the agent harness is well tested. The benchmark files test skill correctness, not agent behavior under real conditions.
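A harness can reduce silent failures by validating what each step hands to the next and refusing to continue on a mismatch. The sketch below checks a handoff manifest for required fields, types, and allowed values. The manifest fields (`format`, `num_images`, `annotation_path`) and the allowed format names are assumptions for illustration, not the structure NVIDIA's skills actually emit.

```python
REQUIRED_FIELDS = {"format": str, "num_images": int, "annotation_path": str}
ALLOWED_FORMATS = {"COCO", "PASCAL_VOC"}


def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of problems; an empty list means the manifest looks usable."""
    errors = []
    for key, expected_type in REQUIRED_FIELDS.items():
        if key not in manifest:
            errors.append(f"missing field: {key}")
        elif not isinstance(manifest[key], expected_type):
            actual = type(manifest[key]).__name__
            errors.append(f"{key} should be {expected_type.__name__}, got {actual}")
    fmt = manifest.get("format")
    if isinstance(fmt, str) and fmt not in ALLOWED_FORMATS:
        errors.append(f"unsupported format: {fmt}")
    count = manifest.get("num_images")
    if isinstance(count, int) and count <= 0:
        errors.append("num_images must be positive")
    return errors


def require_valid(manifest: dict) -> dict:
    """Raise instead of letting a malformed manifest reach the next step."""
    errors = validate_manifest(manifest)
    if errors:
        raise ValueError("; ".join(errors))
    return manifest


# A manifest with a wrong type, a missing field, and a mismatched format name.
bad = {"format": "coco", "num_images": "120"}
print(validate_manifest(bad))
```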
Physical AI has a long-standing plumbing problem: the gap between a model capability and a production workflow has historically been filled with brittle custom code that breaks on every library update. Packaging that plumbing into verifiable, agent-executable skills with signed provenance is the right direction. The open question is whether NVIDIA can sustain the maintenance load as the six platforms underneath these skills keep shipping new releases.
Sources:
- NVIDIA Releases Major Collection of Open Source Agent Tools and Skills for Physical AI - NVIDIA Newsroom, June 1, 2026
- NVIDIA Enables Physical AI Research With Agent Skills at CVPR - NVIDIA Blog
- NVIDIA Releases New and Updated Tools for Physical AI at GTC Taipei - The Robot Report
- NVIDIA Agent Skills GitHub Repository - github.com/nvidia/skills
- NVIDIA Isaac GR00T Reference Humanoid Robot - NVIDIA Newsroom
