The announcement centers heavily on coding and agent use cases. Qwen says the model surpasses its direct predecessor, Qwen3.5-35B-A3B, by a wide margin and also competes with much larger dense models such as Qwen3.5-27B and Gemma4-31B. The benchmark charts released with the post show Qwen3.6-35B-A3B reaching 73.4 on SWE-bench Verified, 67.2 on SWE-bench Multilingual, 49.5 on SWE-bench Pro, and 51.5 on Terminal-Bench 2.0. Additional coding and agent scores shown in the materials include 52.6 on QwenClawBench, 29.4 on NL2Repo, 37.0 on MCPMark, and an Elo rating of 1397 on QwenWebBench. Qwen also highlights broader agent results such as 67.2 on TAU3-Bench, 62.8 on MCP-Atlas, and 60.1 on WideSearch.

⚡ Meet Qwen3.6-35B-A3B: Now Open-Source! 🚀🚀
— Qwen (@Alibaba_Qwen) April 16, 2026
A sparse MoE model, 35B total params, 3B active. Apache 2.0 license.
🔥 Agentic coding on par with models 10x its active size
📷 Strong multimodal perception and reasoning ability
🧠 Multimodal thinking + non-thinking modes… pic.twitter.com/UMiChPaLid
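The "35B total params, 3B active" split in the tweet above is the core of the efficiency pitch: a sparse MoE routes each token through only a small slice of the network. A minimal back-of-envelope sketch of what that ratio implies (the FLOPs rule of thumb is an illustrative approximation, not a figure from the announcement):

```python
# Back-of-envelope: what "35B total, 3B active" implies per token.
# Parameter counts are from the announcement; the FLOPs comparison
# uses the common ~2 * params FLOPs-per-token approximation.

TOTAL_PARAMS = 35e9   # all experts resident in memory
ACTIVE_PARAMS = 3e9   # parameters actually used per forward pass

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Active fraction per token: {active_fraction:.1%}")

# Rough per-token compute versus a hypothetical dense model of the
# same total size: dense spends ~2*total, the MoE spends ~2*active.
dense_flops = 2 * TOTAL_PARAMS
moe_flops = 2 * ACTIVE_PARAMS
print(f"Approx. per-token compute vs. a dense 35B model: {moe_flops / dense_flops:.1%}")
```

Under this rough model, each token costs under a tenth of the compute of a dense 35B forward pass, which is the sense in which the release compares the model to ones "10x its active size".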
Beyond coding, the company presents Qwen3.6-35B-A3B as a more general-purpose model that still supports both multimodal thinking and non-thinking modes. In the language and reasoning tables shared in the release, the model posts 85.2 on MMLU-Pro, 93.3 on MMLU-Redux, 90.0 on C-Eval, and 86.0 on GPQA. On STEM and problem-solving tasks, the posted results include 80.4 on LiveCodeBench v6, 83.6 on HMMT Feb 26, 78.9 on IMOAnswerBench, and 92.7 on AIME26. Together, those numbers are used in the announcement to argue that the model is not limited to coding-only scenarios, even though agentic coding remains the lead message of the release.

LM Performance: Qwen3.6-35B-A3B outperforms the dense 27B-param Qwen3.5-27B on several key coding benchmarks and dramatically surpasses its direct predecessor Qwen3.5-35B-A3B, especially on agentic coding and reasoning tasks. pic.twitter.com/PyXDNruoy2
— Qwen (@Alibaba_Qwen) April 16, 2026
Multimodality is the other major theme in the launch. Qwen says Qwen3.6 is natively multimodal, and that Qwen3.6-35B-A3B shows perception and multimodal reasoning capabilities well beyond what its size would suggest. The company states that across most vision-language benchmarks, the model matches Claude Sonnet 4.5 and surpasses it on several tasks. The charts shared in the announcement show Qwen3.6-35B-A3B scoring 81.7 on MMMU, 75.3 on MMMU-Pro, 85.3 on RealWorldQA, 92.8 on MMBench EN-DEV v1.1, 89.9 on OmniDocBench1.5, 81.9 on CC-OCR, and 92.7 on AI2D_TEST. Qwen gives special emphasis to spatial intelligence, where the model reaches 92.0 on RefCOCO and 50.8 on ODInW13. The same release also includes video benchmarks such as 86.6 on VideoMME, 83.7 on VideoMMMU, and 86.2 on MLVU.

Taken together, the release positions Qwen3.6-35B-A3B as an open-source model built around agentic coding, multimodal understanding, and efficient inference. Qwen’s framing stays focused on what the model can deliver with only about 3 billion active parameters, while the benchmark set is used to show that the model is being pitched not only for code generation, but also for reasoning, document understanding, visual tasks, and broader agent workflows.

VLM Performance: Qwen3.6 is natively multimodal, and Qwen3.6-35B-A3B showcases perception and multimodal reasoning capabilities that far exceed what its size would suggest, with only around 3 billion activated parameters. Across most vision-language benchmarks, its performance… pic.twitter.com/nOVBNlVfzW
— Qwen (@Alibaba_Qwen) April 16, 2026