
Ministral 3B
Mistral AI's smallest open-weight model - 3B parameters, 256K context, Apache 2.0 license, built for edge and cost-sensitive deployments.

South Korean startup LetinAR raises $18.5M to scale its PinTILT optical modules, which already power AI glasses and AR helmets as shipments hit 8.7 million units globally in 2025.

Google Chrome silently installs a 4 GB Gemini Nano model file on user devices with no consent prompt and re-downloads it if you delete it.

Tim Cook becomes executive chairman and John Ternus, the hardware engineer behind Apple Silicon, takes the CEO role on September 1 - a clear bet that chips beat software in the AI race.

Rankings of the best LLMs for on-device edge inference - phones, laptops without GPUs, Raspberry Pi, and Jetson - scored by quality benchmarks and real tokens/sec on iPhone, MacBook, and Raspberry Pi 5.

Google's AI Edge Gallery officially launched on the Play Store and App Store on April 9, running Gemma 4 E2B and E4B models fully offline on any phone from Android 12 or iOS 17 onward.

Google releases Gemma 4 with a 26B MoE, 31B Dense, and two edge variants under Apache 2.0 - claiming the highest intelligence-per-parameter of any open model.

Microsoft's Phi-4 reasoning family delivers near-70B-class math performance in a 14B open-weight package, but the overthinking problem is real and the use case is narrower than the benchmarks suggest.

NVIDIA's Nemotron 3 Nano 4B packs a Mamba-dominant hybrid architecture, 262K token context, and 95.4% on MATH500 into a model that fits an 8 GB Jetson Orin Nano.

Paolo Ardoino says Tether's AI team will release a 'true breakthrough' this week, building on QVAC - the company's on-device AI platform trained on 148 billion tokens with no cloud dependency.

IBM's new 1B-parameter speech model claims the top spot on the Open ASR Leaderboard while running on consumer hardware, beating Whisper Large V3 by 25% on word error rate.

Rankings of the best small language models under 10 billion parameters, comparing Phi-4, Gemma 3, Qwen 3.5, and more across key benchmarks.