
LLM API Pricing Comparison - April 2026
Current LLM API prices verified April 2026: Mistral Nemo at $0.02/MTok cheapest, DeepSeek V3.2 best value, Claude Opus 4.7 launches with a hidden 35% tokenizer cost increase.
They summarize our coverage. We write it.
Newsletters like this one rebroadcast our headlines - often without the full review, the source reading, or the analysis underneath. Our weekly briefing sends the work they paraphrase, straight from the desk, before they get to it.
Free, weekly, no spam. One email every Tuesday. Unsubscribe anytime.

Current LLM API prices verified April 2026: Mistral Nemo at $0.02/MTok cheapest, DeepSeek V3.2 best value, Claude Opus 4.7 launches with a hidden 35% tokenizer cost increase.

Z.AI updated its GLM Coding Plan usage policy. Non-coding requests now trigger aggressive throttling, and three violations mean a permanent ban - which explains the wave of 1302 and 1303 rate-limit errors users have been hitting this week.

Normalized per-second pricing for Sora 2, Veo 3, Runway Gen-4, Kling 2.x, Luma Ray2, Seedance 2, and more - Kling and Haiper lead on cost.

True cost breakdown of commercial agent frameworks and platforms - LangGraph, CrewAI, AutoGen, E2B, Modal, Fly.io, and more at 1k, 100k, and 1M runs, including LLM passthrough costs.

Raw GPU rental rates across 20+ providers normalized to per-GPU-hour - H100, H200, A100, L40S, RTX 4090, on-demand vs spot vs reserved, with hidden costs and value-tier recommendations.

Per-image cost comparison for vision APIs across OpenAI, Anthropic, Google, Mistral, Meta Llama 4, xAI, Amazon Nova, and open-source models - with cost-at-scale math for OCR and document processing workloads.

Per-query pricing for search APIs used in AI agents and RAG pipelines - Brave, Tavily, Exa, SerpAPI, Serper, Perplexity Sonar, You.com, Jina Reader, Firecrawl, and more compared at 10k, 100k, and 1M queries.

Per-minute and per-1000-minute transcription API pricing across OpenAI Whisper, Deepgram Nova-3, AssemblyAI, Google Chirp 2, Azure, AWS Transcribe, Groq, ElevenLabs Scribe, and more.

Normalized per-1M-character and per-hour TTS pricing across ElevenLabs, OpenAI, Google, Azure, Amazon Polly, Play.ht, Cartesia, Deepgram Aura, WellSaid, and more.

A 755-point Hacker News thread and a 139-upvote GitHub issue document Claude Code Pro Max 5x users exhausting their quota in 1.5 hours. An independent investigation with 1,500 logged API calls reveals the math behind the drain.

Embedding API costs compared for OpenAI, Voyage AI, Cohere, Google, Mistral, and AWS - voyage-4-lite and OpenAI 3-small tie at $0.02/MTok for best budget value.

A developer used an HTTP proxy to capture full API requests across four Claude Code versions and found that v2.1.100 adds roughly 20,000 invisible server-side tokens to every request - inflating billing by 40% with no user visibility.