Articles Tagged "Edge AI"

Ministral 3 8B

Mistral AI's mid-tier open-weight edge model - 8B parameters, 256K context, Apache 2.0 license, built for agentic pipelines and cost-sensitive production workloads.

Google Gemma 4 QAT Fits Frontier AI in Under 1GB

Google DeepMind's new QAT checkpoints shrink the Gemma 4 E2B model to under 1GB, making serious on-device AI viable for phones and budget laptops.

NVIDIA Cosmos 3

NVIDIA Cosmos 3 is an open physical AI omnimodel with a Mixture-of-Transformers architecture that natively handles text, images, video, sound, and robot actions in a single 16B or 64B model.

Ministral 3B

Mistral AI's smallest open-weight model - 3B parameters, 256K context, Apache 2.0 license, built for edge and cost-sensitive deployments.

LetinAR Scores $18M for the Optics Behind AI Glasses

South Korean startup LetinAR raises $18.5M to scale its PinTILT optical modules, which already power AI glasses and AR helmets as shipments hit 8.7 million units globally in 2025.

Chrome Installs 4 GB Gemini Nano Without Asking

Google Chrome silently installs a 4 GB Gemini Nano model file on users' devices with no consent prompt and re-downloads it if they delete it.

Apple's Next CEO Is the Engineer Who Built Its Chips

Tim Cook becomes executive chairman and John Ternus, the hardware engineer behind Apple Silicon, takes the CEO role on September 1 - a clear bet that chips beat software in the AI race.

Edge and Mobile LLM Leaderboard 2026: Phi, Gemma, Qwen

Rankings of the best LLMs for on-device edge inference - phones, laptops without GPUs, Raspberry Pi, and Jetson - scored by quality benchmarks and real tokens/sec on iPhone, MacBook, and Raspberry Pi 5.

Google AI Edge Gallery Puts Gemma 4 on Your Phone

Google's AI Edge Gallery officially launched on the Play Store and App Store on April 9, running Gemma 4 E2B and E4B models fully offline on any phone from Android 12 or iOS 17 onward.

Google Gemma 4 Ships Four Open Models Under Apache 2.0

Google releases Gemma 4 with a 26B MoE, 31B Dense, and two edge variants under Apache 2.0 - claiming the highest intelligence-per-parameter of any open model.

Microsoft Phi-4 Reasoning: Small Model, Big Math

Microsoft's Phi-4 reasoning family delivers near-70B-class math performance in a 14B open-weight package, but the overthinking problem is real and the use case is narrower than the benchmarks suggest.

Nemotron 3 Nano 4B: NVIDIA Edge Model Runs on 8GB

NVIDIA's Nemotron 3 Nano 4B packs a Mamba-dominant hybrid architecture, 262K token context, and 95.4% on MATH500 into a model that fits an 8GB Jetson Orin Nano.