Articles Tagged "Diffusion Models"

NVIDIA SANA-WM

NVIDIA's SANA-WM is a 2.6B-parameter hybrid linear diffusion transformer that generates 60-second 720p video with 6-DoF camera control on a single H100, built for embodied AI and robotics simulation.

NVIDIA SANA-WM - Minute-Scale Video on One GPU

NVIDIA's NVLabs has open-sourced SANA-WM, a 2.6B-parameter world model that generates 60-second 720p camera-controlled video on a single GPU, outperforming 14B+ competitors that need 8 GPUs.

Stable Audio 3.0 Ships Open Weights, 6-Min Songs

Stability AI releases Stable Audio 3.0 as a four-model family with a new SAME autoencoder, open weights for three of the four variants, and tracks up to 6 minutes 20 seconds, while Suno and Udio face ongoing copyright lawsuits over their training data.

HiDream-O1-Image

HiDream-O1-Image is an 8B open-source text-to-image model with a pixel-space diffusion architecture that outperforms 32B FLUX.2 [dev] across five major benchmarks.

Ideogram 3.0

Ideogram 3.0 is Ideogram AI's most capable text-to-image model, leading the field in typography accuracy at ~90-95% and offering production-ready API access at $0.03-$0.09 per image.

Veo 3.1

Google DeepMind's Veo 3.1 generates 4K video with native audio and is now free for every Google account at 10 clips per month via Google Vids.

MAI-Image-2-Efficient

MAI-Image-2-Efficient is Microsoft's production-focused image generation model, 41% cheaper and 22% faster than MAI-Image-2 and optimized for high-volume enterprise workflows.

LTX-2.3

LTX-2.3 is a 22-billion-parameter open-source video generation model from Lightricks that produces native 4K video with synchronized audio in a single diffusion pass.

Helios

Helios is a 14B open-source autoregressive diffusion model that generates minute-long videos at 19.5 FPS on a single H100, matching the speed of 1.3B distilled models at full 14B quality.