
ERNIE 5.0
Baidu's ERNIE 5.0 combines 2.4 trillion parameters with a native omni-modal design, landing in LMArena's global top 10 and outpacing GPT-5 High on chart and document benchmarks.

AI Benchmarks & Tools Analyst
James is a software engineer turned tech writer who spent six years building backend systems at a fintech startup in Chicago before pivoting to full-time analysis of AI tools and infrastructure. His engineering background means he doesn't just read the spec sheet - he runs the benchmarks, profiles the latency, and checks whether the marketing claims hold up under real workloads.
He studied Computer Science at the University of Illinois at Urbana-Champaign, where he first got hooked on natural language processing during a senior research project on sentiment analysis. He later completed a certificate in data journalism from Northwestern's Medill School.
At Awesome Agents, James owns the leaderboards and tool comparison coverage. He maintains the site's benchmark tracking methodology and is the person who actually runs the numbers before publishing any ranking. He is also an open-source advocate and contributes to several projects in the LLM inference space.
Based in Chicago, IL.

LG AI Research's first open-weight vision-language model packs 33B parameters, 262K context, and STEM scores above GPT-5-mini - but ships under a non-commercial license.

Alibaba's Qwen3.5-Omni takes text, images, audio, and video as input and streams both text and speech output in a single end-to-end model with a 256K context window.

OpenAI's GPT-5.4-Cyber is a cyber-permissive fine-tune of GPT-5.4 Thinking with binary reverse engineering, 88.23% on professional CTFs, and access gated through the Trusted Access for Cyber program.

Google's Gemini 3.1 Flash TTS ships in preview with 30 voices, 70-plus languages, 200-plus inline audio tags, and Elo 1,211 on the Artificial Analysis TTS Arena.

Google DeepMind's Veo 3.1 generates 4K video with native audio and is now free for every Google account at 10 clips per month via Google Vids.

Alibaba's first closed-weights flagship Qwen ships with a 256K context window, tops six agentic coding benchmarks, and ranks third on the Artificial Analysis Intelligence Index.

OpenAI's first domain-specific reasoning model for biology and drug discovery launched April 16, 2026, as a US-only research preview with a 0.751 BixBench score.

Moonshot AI's Kimi K2.6 is a 1T-parameter MoE with 32B active per token, 256K context, a 300-agent swarm running 4,000 coordinated steps, and the top SWE-Bench Pro score among open-weight models at 58.6%.

Current LLM API prices, verified April 2026: Mistral Nemo is cheapest at $0.02/MTok, DeepSeek V3.2 is the best value, and Claude Opus 4.7 launches with a hidden 35% tokenizer cost increase.

Complete buying guide for AI home workstations in 2026 - pre-built machines and DIY builds for running local LLMs from 3B to 70B+ models, with benchmarks, part lists, and price-tier comparisons.

Normalized per-second pricing for Sora 2, Veo 3, Runway Gen-4, Kling 2.x, Luma Ray2, Seedance 2, and more - Kling and Haiper lead on cost.