
DeepSeek Nears $7.4B Close With Tencent and CATL
DeepSeek's maiden external funding round is nearing completion at a valuation of up to $59B, with Tencent and EV battery giant CATL as the biggest outside investors.

MiniMax M3 uses sparse attention to cut long-context inference cost 20x, topping GPT-5.5 on coding benchmarks at a fraction of the price.

Blackstone-backed AirTrunk pledges $30 billion and 5GW of AI data center capacity in India by 2030 - more than triple the country's current total installed base.

OpenAI's new Lockdown Mode cuts the network exits that prompt injection attacks use to steal data from ChatGPT - but won't stop malicious instructions from entering the model in the first place.

Tracking AI supply-chain attacks, agent exploits, prompt injection, model leaks, and the real-world incidents shaping AI security today.

Google DeepMind's new QAT checkpoints shrink the Gemma 4 E2B model to under 1GB, making serious on-device AI viable for phones and budget laptops.

The Trump administration is in talks with OpenAI about donating equity to a US sovereign-style fund, a deal that would make American taxpayers co-owners of the most valuable AI startup on Earth.

Google will pay SpaceX $920 million per month for 110,000 NVIDIA GPUs at Colossus 1, citing unexpected demand for its Gemini Enterprise agent platform.

NVIDIA's 550B Nemotron 3 Ultra, released June 4, tops the US open-weight leaderboard with a hybrid Mamba-Transformer MoE architecture and 300-plus tokens per second throughput.

OpenAI's Dreaming V3 replaces ChatGPT's flat memory with a hierarchical relational system, kicking off a four-way race for AI personalization dominance.

A beginner's guide to using AI tools like Fathom, Otter.ai, Zoom AI, and Google Meet's Gemini to automatically capture meeting notes and follow-up tasks.

Learn how to use ChatGPT, Perplexity, Gemini, and Amazon's AI assistant to research products, compare prices, and spot fake reviews before you buy.

A practical beginner's guide to using AI tools to write a stronger resume, craft tailored cover letters, and prepare confidently for job interviews.

MiniMax M3 arrives as the first open-weight model to combine frontier coding, 1M-token context, and native multimodality - at a fraction of proprietary pricing - but every benchmark figure is self-reported and the weights weren't even shipped at launch.

Claude Opus 4.8 sets new highs on SWE-bench Pro and long-context tasks while a 4x improvement in code flaw detection may matter more than any benchmark number.

Google's Antigravity 2.0 rewrites the platform from a browser IDE into a five-surface agent suite. The architecture is ambitious, but the launch was a mess.

Current rankings of the best AI image generation models, including GPT Image 2, Nano Banana 2, Recraft V4.1, HiDream-O1-Image, FLUX 2, Midjourney v8.1, and Ideogram 3.0, scored on human preference, text rendering, and photorealism.

Rankings of the best AI models and agent frameworks on the GAIA benchmark, which tests real-world multi-step tasks requiring web browsing, tool use, and multi-hop reasoning.

Rankings of AI models by cost efficiency in May 2026, comparing performance per dollar across frontier and budget models. Updated with DeepSeek V4, GPT-5.5, and Kimi K2.6.

NVIDIA's 550B open-weight MoE model with 55B active parameters, hybrid Mamba-Transformer architecture, and 1M token context - the top-scoring US open model on the Artificial Analysis Intelligence Index.

MiniMax M3 is an open-weight frontier model with a 1M-token context window, native multimodal input, and strong agentic coding at $0.60/M input tokens.

Meta's Llama 3.3 70B Instruct matches Llama 3.1 405B on instruction following and math while running at 4-5x lower cost, with the lowest hallucination rate of any open-weight model on the Vectara summarization leaderboard.

NVIDIA commits a gigawatt of Vera Rubin chips to Mira Murati's startup, a supply the FT values at tens of billions of dollars, alongside an undisclosed cash investment.

NVIDIA Nemotron 3 Super is a 120B-parameter open model with 12B active at inference, combining Mamba-2, LatentMoE, and Multi-Token Prediction for agentic workloads with a 1M token context window.

NVIDIA releases Nemotron 3 Super, a 120B-parameter open model with only 12B active at inference, combining Mamba-2 and Transformer layers for agentic AI workloads with a 1M token context window.

Meta published a four-generation MTIA silicon roadmap delivering chips every six months through 2027, with compute scaling 25x from MTIA 300 to MTIA 500.

Anthropic's Claude Code CLI suffered an OAuth authentication outage on March 11, locking developers out mid-work while the Claude API remained operational.

Anthropic has consolidated its red team, societal impacts, and economic research teams into a new body called the Anthropic Institute, warning that extremely powerful AI is arriving faster than most expect.

How to migrate your RAG pipeline from LangChain to LlamaIndex, with side-by-side code examples for document loading, indexing, querying, and agents.

How to move your vector search workload from Pinecone to PostgreSQL with pgvector, including schema mapping, data migration, and cost savings of up to 75%.

A practical guide to switching from OpenAI's chat completions to Anthropic's Messages API, covering endpoint mapping, tool use differences, and pricing.

Luma Agents coordinates text, image, video, and audio from a single brief using the Uni-1 unified model - a genuine architectural leap, with some real rough edges still showing.

IBM's new 1B-parameter speech model claims the top spot on the Open ASR Leaderboard while running on consumer hardware, beating Whisper Large V3 by 25% on word error rate.

A Hugging Face survey of 16 open-source reinforcement learning libraries finds the entire ecosystem has converged on async disaggregated training to fix a single brutal bottleneck: GPU idle time during long rollouts.