Articles Tagged "Open Source"

Kimi K2.7-Code

Moonshot AI's Kimi K2.7-Code is a 1T-parameter open-weight MoE coding model with mandatory thinking mode, 256K context, and 30% fewer reasoning tokens than K2.6.

Kimi K2.7-Code - Moonshot's Open-Weight Coding Leap

Moonshot AI ships Kimi K2.7-Code with 30% fewer reasoning tokens and a 21.8% gain on its own coding benchmarks, but the model still trails Claude Opus 4.8 on most tests in the same table.

Mistral Seeks €3B Round, Valuation Hits €20B

Mistral AI is in talks to raise €3 billion at a €20 billion valuation, nearly doubling its September 2025 price tag in under a year and cementing its status as Europe's most valuable AI company.

Best AI Coding IDEs 2026: Cursor, Windsurf, Kiro, Zed, Copilot

A benchmark-driven comparison of the five leading AI coding IDEs in 2026, covering pricing, agent capabilities, and who each one is actually built for.

Google DiffusionGemma: Parallel LLM Hits 1,100 t/s

Google DeepMind open-sources DiffusionGemma, a 26B MoE model that generates 256 tokens per denoising pass instead of one at a time, reaching 1,100 tokens per second on a single H100.

DiffusionGemma 26B is Google DeepMind's open-weight discrete diffusion language model that generates 256 tokens in parallel, reaching 1,100+ tokens/sec on an H100 - roughly 4x faster than autoregressive models of the same size.

OpenCode Hits 8M Users, a Year from a Toronto Meetup

OpenCode reaches 8 million monthly users and 172K GitHub stars in one year, displacing Claude Code as the most-starred open-source coding agent.

Ministral 3 8B

Mistral AI's mid-tier open-weight edge model - 8B parameters, 256K context, Apache 2.0 license, built for agentic pipelines and cost-sensitive production workloads.

Devstral 2

Mistral's open-weight coding agent model - 123B parameters, 256K context window, 72.2% on SWE-bench Verified, priced at $0.40/M input tokens.

Ministral 3 14B

Mistral AI's largest Ministral 3 model - 14B parameters, 256K context, Apache 2.0 license, multimodal, built for local deployment and agentic workflows.

MiniMax M3 Makes 1M Context Viable With Sparse Attention

MiniMax M3 uses sparse attention to cut long-context inference cost 20x, topping GPT-5.5 on coding benchmarks at a fraction of the price.

Google Gemma 4 QAT Fits Frontier AI in Under 1GB

Google DeepMind's new QAT checkpoints shrink the Gemma 4 E2B model to under 1GB, making serious on-device AI viable for phones and budget laptops.
