Models

Fara-1.5

Microsoft Research's family of open-weight browser computer-use agents (4B, 9B, 27B) that beat OpenAI Operator and Gemini 2.5 Computer Use on Online-Mind2Web.

ERNIE 5.1

Baidu's ERNIE 5.1 is a text-focused, 800B-parameter MoE model that claims the top Chinese model slot on LMArena and was built at 6% of the training cost of comparable models.

MAI-Transcribe-1.5

Microsoft's second-generation speech-to-text model with 2.4% WER, 43-language support, keyword biasing, and 5x faster long-audio processing than models of comparable accuracy.

Voxtral TTS

Mistral's first open-weight text-to-speech model: 4B parameters, 70ms latency, voice cloning from 3 seconds of audio, and a 68.4% win rate over ElevenLabs Flash v2.5 in blind tests.

GPT-5.1

GPT-5.1 is OpenAI's November 2025 coding and agentic flagship with 400K context, configurable reasoning effort, and 76.3% on SWE-bench Verified.

Alibaba's generalist VLA model for robotic manipulation, built on Qwen3.5-4B with a DiT action decoder, trained on 38,100+ hours of open-source data, and ranked first on the RoboChallenge generalist track.

Qwen3.7-Plus

Alibaba's first multimodal agent model, combining GUI grounding (ScreenSpot Pro 79.0), 1M-token context, and text-plus-vision input at $0.40/M tokens.

GLM-5.2

Z.ai's GLM-5.2 is a 744B open-weight MoE model with a 1M-token context window, an MIT license, and first-day support for eight coding agents, at roughly 1/10th the cost of US frontier models.

Kimi K2.7-Code

Moonshot AI's Kimi K2.7-Code is a 1T-parameter open-weight MoE coding model with mandatory thinking mode, 256K context, and 30% fewer reasoning tokens than K2.6.

MAI-Thinking-1

Microsoft's first in-house reasoning model, a 35B-active sparse MoE with 256K context, 97% on AIME 2025, and no distillation from third-party labs.

DiffusionGemma 26B

DiffusionGemma 26B is Google DeepMind's open-weight discrete diffusion language model that generates 256 tokens in parallel, reaching 1,100+ tokens/sec on an H100, roughly 4x faster than autoregressive models of the same size.

Claude Fable 5

Claude Fable 5 is Anthropic's first publicly available Mythos-class model, with safety classifiers that fall back to Claude Opus 4.8 for high-risk requests across cybersecurity, biology, and chemistry.
