AI Coding

Xiaomi MiMo-V2-Pro - Agentic 1T MoE Model

Xiaomi's MiMo-V2-Pro is a 1-trillion-parameter MoE model with 42B active parameters, a 1M-token context window, and agentic coding performance that rivals Claude Sonnet 4.6 at a fraction of the cost.

Leanstral Outperforms Claude Sonnet at Formal Code Proofs

Mistral's new open-source Lean 4 agent scores higher than Claude Sonnet on formal proofs at one-fifteenth the cost, raising the bar for trustworthy AI code generation.

Cursor's Composer 2 Is Kimi K2.5 With RL - And No Attribution

A developer leaked the model ID for Cursor's Composer 2: kimi-k2p5-rl-0317-s515-fast. Moonshot AI says Cursor violated the Kimi K2.5 license by not displaying attribution in a $2B ARR product.

Cursor Ships Composer 2 - Its First In-House Coding Model

Cursor launches Composer 2, its first in-house coding model trained via RL on long-horizon tasks, scoring 73.7 on SWE-bench Multilingual at $0.50/M input tokens.

Claude Sonnet 4.6: Mid-Tier Model, Flagship Results

Anthropic's mid-tier model matches Opus 4.6 on computer use, leads all models on office productivity tasks, and, at $3/$15 per million tokens, costs one-fifth as much as the flagship.

Codex vs Claude Code: Agentic Coding Tools Compared

A head-to-head comparison of OpenAI Codex and Anthropic Claude Code covering benchmarks, pricing, features, and real-world performance for agentic coding workflows.

METR: Half of SWE-Bench Passes Fail Real Code Review

METR found maintainers would reject roughly half of AI PRs that pass SWE-bench automated grading, with a 24-point gap that suggests benchmark scores substantially overstate production readiness.

Switching from GitHub Copilot to Cursor

A developer's guide to moving from GitHub Copilot to Cursor IDE, covering settings migration, feature mapping, agent workflows, and pricing differences.

Augment Code Intent Review: Orchestration Over Code

Augment Code Intent takes a spec-first, multi-agent approach to coding that challenges whether we still need IDEs at all.

Amazon Mandates Senior Approval for AI-Assisted Code

After a six-hour shopping outage and multiple AI-linked incidents, Amazon now requires junior and mid-level engineers to get senior sign-off before deploying AI-assisted code changes.

Anthropic Ships Multi-Agent Code Review for PRs

Anthropic's new Code Review dispatches parallel AI agents on every pull request to find bugs, rank them by severity, and filter false positives - at $15-25 per review.

MiniMax M2.5 Review: Frontier Code at Bargain Cost

MiniMax M2.5 matches Claude Opus 4.6 on SWE-Bench at 1/20th the price - but a spike in hallucinations and a distillation controversy complicate the story.
