
Google Gemma 4 Ships Four Open Models Under Apache 2.0
Google releases Gemma 4 with a 26B MoE, a 31B dense model, and two edge variants under Apache 2.0 - claiming the highest intelligence-per-parameter of any open model.

Microsoft's Phi-4 reasoning family delivers near-70B-class math performance in a 14B open-weight package, but the overthinking problem is real and the use case is narrower than the benchmarks suggest.

NVIDIA's Nemotron 3 Nano 4B packs a Mamba-dominant hybrid architecture, 262K token context, and 95.4% on MATH500 into a model that fits on an 8GB Jetson Orin Nano.

Paolo Ardoino says Tether's AI team will release a 'true breakthrough' this week, building on QVAC - the company's on-device AI platform trained on 148 billion tokens with no cloud dependency.

IBM's new 1B-parameter speech model claims the top spot on the Open ASR Leaderboard while running on consumer hardware, with a word error rate 25% lower than Whisper Large V3's.

Rankings of the best small language models under 10 billion parameters, comparing Phi-4, Gemma 3, Qwen 3.5, and more across key benchmarks.

AMD expands its Ryzen AI Embedded P100 family with six new 8-to-12-core processors delivering 80 system TOPS, targeting industrial automation, robotics, and medical imaging.

A developer ported NVIDIA's PersonaPlex 7B speech-to-speech model to native Swift using MLX, running full-duplex conversation on Apple Silicon with no cloud, no Python, and faster-than-real-time inference.

Apple's cheapest Mac ever packs the A18 Pro iPhone chip with a 16-core Neural Engine - but its 60 GB/s memory bandwidth puts a hard ceiling on what local models you can actually run.
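To make that ceiling concrete, here is a back-of-envelope sketch. The rule of thumb (an assumption, not an Apple spec) is that autoregressive decoding is memory-bound: every active weight is streamed from RAM once per generated token, so tokens per second cannot exceed bandwidth divided by model size in bytes.

```python
# Back-of-envelope sketch: why memory bandwidth caps local decode speed.
# Assumption: decoding is memory-bound, so generating one token streams all
# active weights from RAM once. Real throughput is lower still (KV cache,
# activations, overhead); every number here is illustrative, not measured.

def max_tokens_per_sec(bandwidth_gb_s: float, params_billions: float,
                       bytes_per_param: float) -> float:
    weight_bytes = params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / weight_bytes

# A 60 GB/s machine running 4-bit-quantized (~0.5 bytes/param) models:
for params in (3, 8, 14):
    rate = max_tokens_per_sec(60, params, 0.5)
    print(f"{params}B @ 4-bit: ~{rate:.0f} tokens/s ceiling")
# 3B: ~40 tok/s, 8B: ~15 tok/s, 14B: ~9 tok/s - larger models crawl.
```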

Apple launches M5 Pro and M5 Max MacBook Pros with Neural Accelerators in every GPU core, 128GB unified memory, and 614GB/s bandwidth - enough to run Llama 70B on a laptop.
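Running the same back-of-envelope math against those specs (again an estimate under the memory-bound assumption, not a measured figure): a 4-bit Llama 70B is roughly 35 GB of weights, which fits comfortably in 128GB of unified memory, and 614GB/s of bandwidth bounds decode at around 17 tokens/s - slow but usable, which is why the 70B-on-a-laptop claim is plausible.

```python
# Llama 70B at 4-bit quantization: ~0.5 bytes/param, so ~35 GB of weights.
weights_gb = 70 * 0.5
print(614 / weights_gb)  # ~17.5 tokens/s decode ceiling under the assumption
```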

AMD launches the first desktop processors with Copilot+ qualified NPUs, putting 50 TOPS of on-device AI into AM5 desktops starting Q2 2026.

Alibaba completes the Qwen 3.5 lineup with four small models - 0.8B, 2B, 4B, and 9B - all natively multimodal, 262K context, Apache 2.0. The 9B outperforms last-gen Qwen3-30B and beats GPT-5-Nano on vision benchmarks.