Models

Qwen3.5-Omni

Qwen3.5-Omni

Alibaba's Qwen3.5-Omni takes text, images, audio, and video as input and streams both text and speech output in a single end-to-end model with a 256K context window.

GPT-5.4-Cyber

GPT-5.4-Cyber

OpenAI's GPT-5.4-Cyber is a cyber-permissive fine-tune of GPT-5.4 Thinking with binary reverse engineering, 88.23% on professional CTFs, and access gated through the Trusted Access for Cyber program.

Gemini 3.1 Flash TTS

Gemini 3.1 Flash TTS

Google's Gemini 3.1 Flash TTS ships in preview with 30 voices, 70-plus languages, 200-plus inline audio tags, and Elo 1,211 on the Artificial Analysis TTS Arena.

Veo 3.1

Veo 3.1

Google DeepMind's Veo 3.1 generates 4K video with native audio and is now free for every Google account at 10 clips per month via Google Vids.

Qwen3.6-Max-Preview

Qwen3.6-Max-Preview

Alibaba's first closed-weights flagship Qwen ships with a 256K context window, tops six agentic coding benchmarks, and ranks third on the Artificial Analysis Intelligence Index.

GPT-Rosalind

GPT-Rosalind

OpenAI's first domain-specific reasoning model for biology and drug discovery, launched April 16 2026 as a US-only research preview with a 0.751 BixBench score.

Kimi K2.6

Kimi K2.6

Moonshot AI's Kimi K2.6 is a 1T-parameter MoE with 32B active per token, 256K context, a 300-agent swarm running 4,000 coordinated steps, and the top SWE-Bench Pro score among open-weight models at 58.6%.

Arcee Trinity

Arcee Trinity

Arcee Trinity-Large-Thinking is a 400B sparse MoE open-source reasoning model that ranks #2 on PinchBench at $0.85/M output tokens, 28x cheaper than Claude Opus 4.6.

Qwen 3.6-35B-A3B

Qwen 3.6-35B-A3B

Alibaba's 35B sparse MoE with 3B active parameters delivers 73.4% SWE-bench Verified, multimodal vision and video, 256K context, and DeltaNet hybrid architecture under Apache 2.0.

Claude Opus 4.7

Claude Opus 4.7

Anthropic's latest flagship model ships with 3x higher resolution vision, a new xhigh effort level, task budgets for cost control, cyber safeguards, and 13% better coding performance at the same $5/$25 pricing.

Claude Mythos Preview

Claude Mythos Preview

Claude Mythos Preview is Anthropic's most capable model - restricted to 50 orgs via Project Glasswing, with 93.9% on SWE-bench Verified and thousands of autonomous zero-day discoveries.

Muse Spark

Muse Spark

Meta's first closed-source frontier model scores 52 on the Artificial Analysis Intelligence Index, leads on HealthBench Hard, and ships free at meta.ai - but has no public API yet.