Reasoning

GPT-5.4

GPT-5.4

OpenAI's most capable frontier model combines native computer use, 1M-token context, and three variants at $2.50/$15 per million tokens.

Gemini 3 Deep Think

Gemini 3 Deep Think

Google DeepMind's reasoning mode scores 84.6% on ARC-AGI-2, 3455 Codeforces Elo, and solves 18 previously unsolved research problems - outpacing Claude Opus 4.6 and GPT-5.2 on reasoning-heavy tasks.