
Alibaba Drops Qwen 3.5: 397B Parameters of Open-Source Power

Alibaba releases Qwen 3.5, a 397B-parameter open-source multimodal model with a 256K context window, an Apache 2.0 license, and performance that tops Python coding and math reasoning benchmarks.


Alibaba's Qwen team has released Qwen 3.5, a 397-billion-parameter model that arrives with a bold claim: open-source AI can compete with, and in some cases beat, the best proprietary models in the world. Released under the Apache 2.0 license, Qwen 3.5 is free for commercial use and comes with native multimodal capabilities spanning text, images, audio, and video. It is also 60% cheaper to run and delivers 8x the throughput of its predecessor, Qwen 3.

The Numbers

At 397 billion parameters, Qwen 3.5 is one of the largest open-source models available. But sheer size is not what makes it interesting. The model has been optimized extensively for inference efficiency, meaning that despite being larger than Qwen 3, it is significantly cheaper and faster to run.

The 60% cost reduction comes from a combination of architectural improvements and better quantization support. The 8x throughput improvement means the model can process substantially more tokens per second, which translates directly to faster response times for users and lower per-query costs for operators.

The 256K-token context window is as practical as it is generous: it allows the model to process long documents, extended conversations, or large codebases without truncation. For developers building applications that need to reason over substantial amounts of information, this is a meaningful capability.
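To make that budget concrete, here is a minimal sketch that counts the tokens in a codebase with a Hugging Face tokenizer before sending it to the model. The repo ID is a placeholder; substitute whichever repository Alibaba actually publishes.

```python
# Rough check: does an entire codebase fit inside a 256K-token context window?
# "Qwen/Qwen3.5" is a placeholder repo ID, not a confirmed Hugging Face repository.
from pathlib import Path
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5")

total_tokens = 0
for path in Path("my_project").rglob("*.py"):
    text = path.read_text(encoding="utf-8", errors="ignore")
    total_tokens += len(tokenizer.encode(text))

print(f"{total_tokens:,} tokens; fits in 256K context: {total_tokens <= 256_000}")
```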

Native Multimodality

One of the most significant aspects of Qwen 3.5 is its native multimodal architecture. Rather than bolting image or audio understanding onto a text-only model after the fact, Qwen 3.5 was trained from the ground up to handle multiple modalities.

The model can process and reason about text, images, audio, and video within a single conversation. You can show it a photograph and ask it to write code that generates a similar layout. You can feed it an audio clip and ask for a transcript with analysis. You can give it a video and ask it to summarize the key moments.
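For a concrete picture of the first scenario, here is a minimal sketch of an image-plus-text request using the OpenAI-compatible API format that many inference providers expose. The endpoint URL and model identifier are assumptions, not values published by Alibaba.

```python
# Ask the model to turn a screenshot into code via an OpenAI-compatible chat endpoint.
# base_url, api_key, and the model name are placeholders; use your provider's actual values.
from openai import OpenAI

client = OpenAI(base_url="https://your-provider.example/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="qwen-3.5",  # hypothetical identifier
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Write HTML and CSS that reproduce this page layout."},
            {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}},
        ],
    }],
)
print(response.choices[0].message.content)
```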

This native approach to multimodality tends to produce better results than add-on systems because the model develops a unified understanding of how different types of information relate to each other. A natively multimodal model does not just translate an image into a text description and then reason about the text. It reasons about the image directly, which preserves nuance and detail that would otherwise be lost.

Benchmark Performance

Qwen 3.5 achieves top marks on Python coding benchmarks, outperforming both proprietary and open-source alternatives in several key evaluations. On HumanEval+, a benchmark that tests code generation against a far more rigorous suite of test cases than the original HumanEval, Qwen 3.5 sets a new high-water mark for open-source models.

The model also excels at mathematical reasoning. On competition-level math problems, it demonstrates an ability to work through multi-step proofs and calculations that would challenge even advanced human solvers. Alibaba attributes this to a training curriculum that included extensive mathematical reasoning data and reinforcement learning from verification feedback.

General reasoning benchmarks tell a similar story. On MMLU-Pro and GPQA, Qwen 3.5 narrows or closes the gap with the best proprietary models. It does not lead across every benchmark, but it is competitive across the board, and in its strongest areas it is genuinely best-in-class.

The Apache 2.0 Advantage

The choice of Apache 2.0 licensing is significant. This is one of the most permissive open-source licenses available, allowing anyone to use, modify, and distribute the model for any purpose, including commercial applications. There are no usage restrictions, no registration requirements, and no royalty obligations.

For startups and enterprises alike, this removes a major barrier to adoption. Companies can build products on top of Qwen 3.5 without worrying about licensing costs or compliance complications. They can fine-tune it for specific use cases, deploy it on their own infrastructure, and maintain full control over their AI stack.

This matters more than ever as concerns about vendor lock-in and data privacy grow. Running a capable open-source model on your own servers means your data never leaves your infrastructure, a requirement for many industries including healthcare, finance, and government.
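For teams weighing that option, here is a minimal self-hosting sketch assuming the weights are published on Hugging Face and served with vLLM. The repo ID and GPU count are illustrative, and a 397B-parameter model will in practice need a multi-GPU node even when quantized.

```python
# Run the model entirely on your own hardware with vLLM, so prompts never leave your network.
# "Qwen/Qwen3.5" is a placeholder repo ID; tensor_parallel_size depends on your GPU count.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen3.5", tensor_parallel_size=8)
params = SamplingParams(temperature=0.7, max_tokens=512)

outputs = llm.generate(
    ["Summarize this internal policy document in three bullet points: ..."],
    params,
)
print(outputs[0].outputs[0].text)
```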

What This Means for the Open-Source Ecosystem

Qwen 3.5 is the latest evidence that the open-source AI ecosystem is thriving. Alongside Meta's Llama 4, DeepSeek's V3 series, and Z.ai's GLM-5, it represents a growing collection of freely available models that can match or beat proprietary alternatives.

The competitive dynamics are fascinating. Chinese AI companies, particularly Alibaba, DeepSeek, and Zhipu, have embraced open source as a strategic choice. By releasing their best models freely, they build community, attract talent, and establish their technology as the foundation for a wide range of applications. It is a different approach from the closed, proprietary strategy pursued by OpenAI and (until recently) Google, and it is producing remarkable results.

For the average developer or business, this competition is unambiguously good news. The cost of accessing frontier AI capabilities has plummeted over the past year, and the range of options has expanded dramatically. Whether you prefer a proprietary API for simplicity or an open-source model for control, the quality is now comparable.

Getting Started

Qwen 3.5 is available for download from Hugging Face and ModelScope. Alibaba provides pre-quantized versions optimized for different hardware configurations, from multi-GPU server setups to more modest single-GPU deployments. The model is also available through several cloud inference providers for those who prefer an API-based approach.
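As a starting point, the snippet below fetches a quantized variant from Hugging Face for local deployment. The repo ID is hypothetical and should be replaced with whichever variant Alibaba lists on its model page.

```python
# Download the model weights to a local directory before serving them offline.
# The repo_id is a placeholder for an actual quantized release (e.g. an AWQ or GPTQ variant).
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="Qwen/Qwen3.5-Instruct-AWQ",
    local_dir="./qwen3.5",
)
print(f"Weights downloaded to {local_dir}")
```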

Documentation and fine-tuning guides are available in the Qwen GitHub repository, along with example applications demonstrating the model's multimodal capabilities.

About the author

Elena, Senior AI Editor and investigative journalist, has covered artificial intelligence, machine learning, and the startup ecosystem for over eight years.