Articles Tagged "Open Source"

Google Gemma 3 27B

Google Gemma 3 27B is a 27B dense multimodal model supporting text and vision with a 128K context window, 140+ languages, and single-GPU deployment - the most capable open model at its size class.

Meta's Llama 4 Maverick packs 400B total parameters into a 128-expert MoE architecture with only 17B active per token, beating GPT-4o on Chatbot Arena while matching DeepSeek V3 on reasoning at half the active parameters.

Llama 4 Scout

Meta's Llama 4 Scout is a 109B-total, 17B-active MoE model with 16 experts and a 10M-token context window - the longest of any open-weight model - with native multimodal support for text and images.

Microsoft Phi-4

Microsoft's 14B dense transformer that consistently beats models 5x its size on MATH and GPQA, available under the MIT license for unrestricted commercial use.

Mistral Large 3

Mistral Large 3 is a 675B-parameter MoE model activating 41B per token with native multimodal support, a 256K context window, and Apache 2.0 licensing - Europe's first frontier-class open-weight model.

Mistral Small 3.2

Mistral Small 3.2 is a 24B dense model with strong function calling, multimodal vision, and 128K context under Apache 2.0 - optimized for production tool-use pipelines and EU-compliant deployments.

Nemotron 3 Nano 30B-A3B

NVIDIA's hybrid Mamba2+MoE model packs 31.6B total parameters but activates only 3.2B per token, delivering frontier-class reasoning with 3.3x the throughput of comparable models on a single H200 GPU.

Qwen3.5-122B-A10B vs DeepSeek V3.2: Efficiency vs Raw Power in Open-Weight AI

A benchmark-by-benchmark comparison of Qwen3.5-122B-A10B and DeepSeek V3.2 - the efficiency-optimized underdog versus the brute-force open-source heavyweight.

Qwen3.5-122B-A10B vs Llama 4 Maverick: The Efficiency Gap Nobody Expected

A data-driven comparison of Alibaba's Qwen3.5-122B-A10B and Meta's Llama 4 Maverick - two open-weight MoE models with radically different approaches to parameter efficiency and benchmark performance.

Qwen3.5-122B-A10B vs Mistral Large 3: When 4x More Parameters Buys You Less

A data-driven comparison of Qwen3.5-122B-A10B and Mistral Large 3 - two Apache 2.0 MoE models where the smaller one dominates text benchmarks despite a 4x active parameter disadvantage.

Qwen3.5-27B vs Gemma 3 27B: Same Parameter Count, Completely Different Models

A data-driven comparison of Alibaba's Qwen3.5-27B and Google's Gemma 3 27B - two 27B dense models that share a parameter count and almost nothing else.

Qwen3.5-27B vs Mistral Small 3.2: Apache 2.0 Heavyweights Go Head to Head

A data-driven comparison of Alibaba's Qwen3.5-27B and Mistral's Small 3.2 - two Apache 2.0 dense models in the 24-27B range with very different benchmark profiles and deployment strengths.

← Previous