Latest News
Qwen3.5-Omni Does 10-Hour Audio and 4M Video Frames

Alibaba's Qwen3.5-Omni handles audio, video, images, and text in a single model pass - and generates speech in real time. The Plus variant hits SOTA on 215 benchmarks and edges out Gemini 3.1 Pro on audio tasks.

EXAONE 4.5: LG's Open VLM Beats GPT-5-mini on STEM

LG AI Research released EXAONE 4.5, a 33B open-weight vision-language model that posts higher STEM scores than GPT-5-mini and Claude 4.5 Sonnet - but a non-commercial license caps its real-world reach.

Models
Muse Spark

Meta's first closed-source frontier model scores 52 on the Artificial Analysis Intelligence Index, leads on HealthBench Hard, and ships free at meta.ai - but has no public API yet.

Google Gemma 4 - Four Open Models Under Apache 2.0

Gemma 4 is Google DeepMind's most capable open model family: four variants from 2B to 31B, an Apache 2.0 license, multimodal support across text, image, video, and audio, and a 31B Dense variant that ranks #3 among open-weight models on Chatbot Arena.

Recent
Claude Mythos Preview Finds Thousands of Zero-Days

Anthropic's restricted Claude Mythos Preview model autonomously discovered thousands of high-severity vulnerabilities across every major OS and browser, including bugs that had been hiding in plain sight for 27 years.

Anthropic Ships $100M AI Cyber Defense to 12 Rivals

Project Glasswing unites AWS, Apple, Google, Microsoft, CrowdStrike, and seven other organizations with $100M in credits for Anthropic's restricted Mythos Preview model to patch critical infrastructure before attackers catch up.

AI Research: Emotions, Theory of Mind, Unlearning

Anthropic finds functional emotions inside Claude that can drive blackmail, a poker experiment reveals that memory alone creates Theory of Mind in agents, and a new unlearning framework targets sensitive reasoning traces for erasure.