
Nemotron 3 Nano 30B-A3B
NVIDIA's hybrid Mamba2+MoE model packs 31.6B total parameters but activates only 3.2B per token, delivering frontier-class reasoning with 3.3x the throughput of comparable models on a single H200 GPU.

A data-driven comparison of Alibaba's Qwen3.5-35B-A3B and NVIDIA's Nemotron 3 Nano 30B-A3B: two ~30B-parameter MoE models that activate ~3B parameters per token, yet take fundamentally different architectural approaches to the same problem.
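
To make the total-vs-active parameter distinction concrete, here is a minimal top-k MoE routing sketch in PyTorch. Every dimension, the 4x expansion factor, and the naive per-token dispatch loop are illustrative assumptions, not the actual Nemotron or Qwen configuration; production implementations batch the dispatch across tokens.

```python
# Minimal sketch of sparse MoE routing: many expert parameters exist in
# total, but only top_k of n_experts run for any given token. All sizes
# here are illustrative, NOT the real Nemotron 3 Nano config.
import torch
import torch.nn.functional as F

d_model, n_experts, top_k = 512, 64, 4

# One feed-forward "expert" per slot; only top_k of them fire per token.
experts = torch.nn.ModuleList(
    torch.nn.Sequential(
        torch.nn.Linear(d_model, 4 * d_model),
        torch.nn.GELU(),
        torch.nn.Linear(4 * d_model, d_model),
    )
    for _ in range(n_experts)
)
router = torch.nn.Linear(d_model, n_experts)

def moe_layer(x: torch.Tensor) -> torch.Tensor:
    """Route each token to its top_k experts and mix their outputs."""
    logits = router(x)                         # (tokens, n_experts)
    weights, idx = logits.topk(top_k, dim=-1)  # pick top_k experts per token
    weights = F.softmax(weights, dim=-1)       # renormalize the kept gates
    out = torch.zeros_like(x)
    for t in range(x.shape[0]):                # naive per-token dispatch
        for k in range(top_k):
            out[t] += weights[t, k] * experts[int(idx[t, k])](x[t])
    return out

tokens = torch.randn(8, d_model)
y = moe_layer(tokens)

total = sum(p.numel() for p in experts.parameters())
active = total * top_k / n_experts  # only top_k/n_experts of expert weights run
print(f"expert params: {total/1e6:.1f}M total, ~{active/1e6:.1f}M active per token")
```

The printed ratio is the whole trick: compute and memory bandwidth per token scale with the active parameters, while total parameters set the model's capacity, which is how a 31.6B-parameter model can run at roughly 3B-class per-token cost.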