
MiniMax M2.5 Review: Frontier Code at Bargain Cost
MiniMax M2.5 matches Claude Opus 4.6 on SWE-Bench at 1/20th the price - but a spike in hallucinations and a distillation controversy complicate the story.


vLLM v0.17.0 adds FlashAttention 4, elastic expert parallelism for live MoE rescaling, full Qwen3.5 support, and a performance-mode flag, all in 699 commits from 272 contributors.

Security firm Ona found that Claude Code bypasses its own denylist, disables Anthropic's bubblewrap sandbox, and evades kernel-level enforcement through the ELF dynamic linker.

The open-source AI agent framework crossed 250,000 GitHub stars in roughly 60 days, surpassing React's decade-long total. NVIDIA CEO Jensen Huang called it the most important software release ever.

A community fine-tune distills Claude Opus 4.6 chain-of-thought reasoning into Qwen3.5-27B via LoRA, racking up 4,000+ downloads in days. No benchmarks yet - but the approach raises familiar questions.

A community fine-tune that distills Claude Opus 4.6 reasoning into Qwen3.5-27B via LoRA. 28B parameters, Apache 2.0, no published benchmarks.

Comparing the Claude Opus reasoning-distilled Qwen3.5-27B against the base model - what chain-of-thought distillation adds and what it costs in context, multimodal, and reliability.
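For readers unfamiliar with the mechanism behind this kind of fine-tune, here is a minimal sketch of the standard LoRA parameterization: a frozen base weight is augmented with a trainable low-rank product scaled by alpha / rank. All shapes and values are toy assumptions; nothing here reflects the actual fine-tune's configuration or training data.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank, alpha = 32, 32, 4, 8  # toy dimensions (assumed)

W = rng.normal(size=(d_out, d_in))        # frozen base weight
A = rng.normal(size=(rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))               # trainable up-projection, zero-init

def lora_forward(x):
    # Base path plus low-rank adapter path; because B starts at zero,
    # the adapted model is exactly the base model at initialization.
    return W @ x + (alpha / rank) * (B @ (A @ x))

x = rng.normal(size=d_in)
assert np.allclose(lora_forward(x), W @ x)  # identity at initialization
```

Distillation then amounts to training only A and B (a few percent of the parameters) on teacher-generated chain-of-thought traces, which is why such adapters can be produced and shared quickly.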

Microsoft releases Phi-4-reasoning-vision-15B - a 15B open-weight multimodal model trained on 240 GPUs in 4 days that competes with 100B+ parameter models on math, science, and GUI understanding.

OLMo Hybrid combines transformer attention with Gated DeltaNet to match OLMo 3 accuracy using 49% fewer tokens and 75% higher throughput on long contexts. Fully open - weights, checkpoints, training code, and technical report.

A developer ported NVIDIA's PersonaPlex 7B speech-to-speech model to native Swift using MLX, running full-duplex conversation on Apple Silicon with no cloud, no Python, and faster-than-real-time inference.

A new open-source toolkit called OBLITERATUS can surgically remove refusal mechanisms from 116 open-weight LLMs using abliteration - no fine-tuning, no training data, just geometry.
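The "just geometry" claim refers to directional ablation: the usual abliteration recipe estimates a "refusal direction" as the difference of mean activations between refused and complied prompts, then projects that direction out of the model's weight matrices. The sketch below illustrates the linear algebra on synthetic data; it is an assumption-laden toy, not OBLITERATUS's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # toy hidden size

# Synthetic activations standing in for captured residual streams,
# with the "harmful" set artificially shifted along one axis.
acts_harmful = rng.normal(size=(100, d)) + 2.0 * np.eye(d)[0]
acts_harmless = rng.normal(size=(100, d))

# Refusal direction: difference of means, normalized to unit length.
r = acts_harmful.mean(axis=0) - acts_harmless.mean(axis=0)
r /= np.linalg.norm(r)

# Ablate: remove the component along r from a weight matrix's output,
# i.e. W' = (I - r r^T) W, so the layer can no longer write along r.
W = rng.normal(size=(d, d))        # toy weight (e.g. an MLP out-projection)
W_abl = W - np.outer(r, r) @ W

# Sanity check: the edited matrix has numerically no output along r.
assert np.abs(r @ W_abl).max() < 1e-9
```

Because the edit is a closed-form projection applied to existing weights, it needs no gradient updates or training data, which is what makes the technique cheap to apply across many open-weight models.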

Arc Institute and NVIDIA release Evo 2, a 40B-parameter open-source model trained on 9.3 trillion nucleotides from every domain of life, with full weights, code, and training data.