Articles Tagged "Edge AI"

Tether CEO Teases AI 'True Breakthrough' This Week

Paolo Ardoino says Tether's AI team will release a 'true breakthrough' this week, building on QVAC - the company's on-device AI platform trained on 148 billion tokens with no cloud dependency.

IBM Granite 4.0 1B Speech Tops OpenASR Leaderboard

IBM's new 1B-parameter speech model claims the top spot on the Open ASR Leaderboard while running on consumer hardware, beating Whisper Large V3 by 25% on word error rate.

Small Language Model Leaderboard: Best Under 10B

Rankings of the best small language models under 10 billion parameters, comparing Phi-4, Gemma 3, Qwen 3.5, and more across key benchmarks.

AMD Pushes P100 Embedded to 12 Cores and 80 TOPS

AMD expands its Ryzen AI Embedded P100 family with six new 8-to-12-core processors delivering 80 system TOPS, targeting industrial automation, robotics, and medical imaging.

PersonaPlex 7B Runs Full-Duplex Speech on a Mac

A developer ported NVIDIA's PersonaPlex 7B speech-to-speech model to native Swift using MLX, running full-duplex conversation on Apple Silicon with no cloud, no Python, and faster-than-real-time inference.

MacBook Neo: Apple's iPhone Chip Lands in a $599 Mac

Apple's cheapest Mac ever packs the A18 Pro iPhone chip with a 16-core Neural Engine - but its 60 GB/s memory bandwidth puts a hard ceiling on which local models you can actually run.

Apple's M5 Pro and Max Make 70B Models Portable

Apple launches M5 Pro and M5 Max MacBook Pros with Neural Accelerators in every GPU core, 128GB unified memory, and 614GB/s bandwidth - enough to run Llama 70B on a laptop.

AMD Ryzen AI 400 Brings 50 TOPS NPU to the Desktop

AMD launches the first desktop processors with Copilot+ qualified NPUs, putting 50 TOPS of on-device AI into AM5 desktops starting Q2 2026.

Qwen 3.5 Small Series Ships Four Models From 0.8B to 9B

Alibaba completes the Qwen 3.5 lineup with four small models - 0.8B, 2B, 4B, and 9B - all natively multimodal, with 262K context and an Apache 2.0 license. The 9B outperforms last-gen Qwen3-30B and beats GPT-5-Nano on vision benchmarks.

Qwen3.5-0.8B

Qwen3.5-0.8B is the smallest natively multimodal model in the Qwen 3.5 family - 0.8B parameters handling text, images, and video with 262K context. MathVista 62.2, OCRBench 74.5. Apache 2.0.

Qwen3.5-2B

Qwen3.5-2B is a 2B dense multimodal model with 262K context, thinking mode, and native vision including video understanding. OCRBench 84.5, VideoMME 75.6. Apache 2.0 licensed.

Qwen3.5-4B

Qwen3.5-4B is a 4B dense multimodal model that matches Qwen3-30B on MMLU-Pro and beats GPT-5-Nano on vision benchmarks. Runs on 8GB VRAM, Apache 2.0 licensed, 262K-1M context.
