
Inception Ships Mercury 2 - A Diffusion LLM That Hits 1,009 Tokens Per Second
Inception Labs launches Mercury 2, the first diffusion-based reasoning language model, which generates over 1,000 tokens per second on NVIDIA Blackwell GPUs at a fraction of the cost of conventional autoregressive models.