
Google Gemma 4 QAT Fits Frontier AI in Under 1GB
Google DeepMind's new QAT checkpoints shrink the Gemma 4 E2B model to under 1GB, making serious on-device AI viable for phones and budget laptops.
They summarize our coverage. We write it.
Newsletters like this one rebroadcast our headlines - often without the full review, the source reading, or the analysis underneath. Our weekly briefing sends the work they paraphrase, straight from the desk, before they get to it.
Free, weekly, no spam. One email every Tuesday. Unsubscribe anytime.

Google DeepMind's new QAT checkpoints shrink the Gemma 4 E2B model to under 1GB, making serious on-device AI viable for phones and budget laptops.

The Trump administration is in talks with OpenAI about donating equity to a US sovereign-style fund, a deal that would make American taxpayers co-owners of the most valuable AI startup on Earth.

Google will pay SpaceX $920 million per month for 110,000 NVIDIA GPUs at Colossus 1, citing unexpected demand for its Gemini Enterprise agent platform.

NVIDIA's 550B Nemotron 3 Ultra, released June 4, tops the US open-weight leaderboard with a hybrid Mamba-Transformer MoE architecture and 300-plus tokens per second throughput.

OpenAI's Dreaming V3 replaces ChatGPT's flat memory with a hierarchical relational system, kicking off a four-way race for AI personalization dominance.

NVIDIA's Agent Toolkit lands 110+ verified skills on GitHub covering robotics, autonomous vehicles, vision AI, and industrial systems - turning complex physical AI pipelines into single agent calls.

A bipartisan Congressional bill would freeze state AI laws for three years and require frontier developers to publish catastrophic risk plans, submit to federal audits, and face $1M daily fines.

NVIDIA's Dynamo Snapshot uses CRIU and cuda-checkpoint to freeze and restore GPU inference containers in seconds, cutting Kubernetes cold-start times by up to 21x for large models.

Meta is building tent-style rapid deployment structures at its Ohio Prometheus facility, cutting construction timelines roughly in half and bypassing grid delays with off-grid gas power.

Florida becomes the first US state to hold an AI CEO personally liable, filing an 83-page complaint accusing OpenAI and Sam Altman of hiding ChatGPT's dangers while racing for market share.

New open-source inference engine for Apple Silicon benchmarks up to 2.6x faster than Ollama, supports 66 model aliases, and drops in as an OpenAI-compatible server on any Mac.

Lovable's multiyear Google Cloud pact quintuples its compute footprint and bundles access to both Claude and Gemini - a bet that enterprise credibility is the next growth act.