DeepSeek-V4 Preview Goes Open Source

4/24/2026
DeepSeek has officially launched and open-sourced DeepSeek-V4 Preview, a two-model series built around long-context efficiency and agentic capability. The release includes DeepSeek-V4-Pro and DeepSeek-V4-Flash. According to the announcement, DeepSeek-V4-Pro has 1.6 trillion total parameters with 49 billion active parameters, while DeepSeek-V4-Flash has 284 billion total parameters with 13 billion active parameters. Both models support a context length of 1 million tokens.

The central message of the release is that million-token context is now standard across DeepSeek's official services, opening the door to longer documents, multi-step reasoning, agentic workflows, and large cross-document tasks. In the technical report, DeepSeek attributes this efficiency to a hybrid attention design that combines Compressed Sparse Attention and Heavily Compressed Attention, along with other architecture and optimization upgrades.

The two models are positioned for different needs. DeepSeek-V4-Pro is the higher-capability option, available through Expert Mode; DeepSeek-V4-Flash is the faster, more efficient, and more economical option, available through Instant Mode. Both models are open source, both support API service, and both are available for web or app use. DeepSeek says users can try the models at chat.deepseek.com, access the updated API, and download open weights from its Hugging Face collection.

DeepSeek's benchmark materials focus on knowledge, reasoning, and agentic capabilities. In the reported results, DeepSeek-V4-Pro-Max scored 57.9 on SimpleQA Verified, 37.7 on HLE, 90.2 on Apex Shortlist, 3206 on Codeforces, 80.6 on SWE Verified, 67.9 on Terminal Bench 2.0, and 51.8 on Toolathlon. The comparison set includes Claude Opus 4.6, GPT-5.4, and Gemini 3.1 Pro.
DeepSeek describes V4-Pro as an open-source model with strong results on agentic coding, broad world knowledge, and reasoning benchmarks. DeepSeek-V4-Flash is introduced as the smaller, more cost-conscious member of the series: the company says its reasoning capabilities closely approach V4-Pro, and that it performs on par with V4-Pro on simple agent tasks. Because of its smaller parameter scale, DeepSeek positions it as the faster and more economical API option.

The pricing table lists the following rates:

deepseek-v4-pro: $0.145 (cache-hit input), $1.74 (cache-miss input), $3.48 (output)
deepseek-v4-flash: $0.028 (cache-hit input), $0.14 (cache-miss input), $0.28 (output)

On the developer side, DeepSeek says users can keep the same base_url and simply update the model name to deepseek-v4-pro or deepseek-v4-flash. Both models support the OpenAI ChatCompletions and Anthropic APIs, and both support dual Thinking and Non-Thinking modes. The company also notes that deepseek-chat and deepseek-reasoner will be fully retired and become inaccessible after July 24, 2026, at 15:59 UTC; for now, those models route to deepseek-v4-flash in thinking or non-thinking mode.

DeepSeek also highlights agent integrations as part of the launch. The company says DeepSeek-V4 is integrated with AI agents such as Claude Code, OpenClaw, and OpenCode, and is already being used for in-house agentic coding at DeepSeek. The announcement also includes a sample PDF generated by DeepSeek-V4-Pro, presented as an example of real-world output.

The release closes with a reminder from DeepSeek to rely only on its official accounts for company news; statements from other channels, the company says, do not reflect its views. DeepSeek frames the V4 Preview release as part of its continued long-term work toward AGI, with cost-effective million-token context as one of the central updates in this launch.
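As a quick illustration of the price gap between the two models, the listed rates can be folded into a simple cost estimator. Note that the announcement does not state the pricing unit, so the per-million-token convention used below is an assumption, and the helper function is purely illustrative.

```python
# Illustrative cost estimator built from the listed prices.
# ASSUMPTION: rates are USD per 1M tokens (the announcement does not
# state the unit); the function and its name are hypothetical helpers.
PRICES = {
    "deepseek-v4-pro":   {"cache_hit": 0.145, "cache_miss": 1.74, "output": 3.48},
    "deepseek-v4-flash": {"cache_hit": 0.028, "cache_miss": 0.14, "output": 0.28},
}

def request_cost(model: str, hit_tokens: int, miss_tokens: int, out_tokens: int) -> float:
    """Return the estimated USD cost of one request under the assumed unit."""
    p = PRICES[model]
    return (hit_tokens * p["cache_hit"]
            + miss_tokens * p["cache_miss"]
            + out_tokens * p["output"]) / 1_000_000
```

Under this per-million-token assumption, an identical request comes out roughly an order of magnitude cheaper on Flash than on Pro.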
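The migration path DeepSeek describes can be sketched in code. This is a minimal, hypothetical sketch of an OpenAI-ChatCompletions-style request body: the model names come from the announcement, but the "thinking" field is an assumed stand-in, since the announcement mentions dual modes without specifying how they are selected.

```python
# Minimal sketch of a ChatCompletions-style request body for the new models.
# Per the announcement, the base_url stays the same; only the model name changes.
# ASSUMPTION: the "thinking" key is a hypothetical placeholder for however the
# API actually toggles Thinking vs Non-Thinking mode.

def build_request(model: str, prompt: str, thinking: bool) -> dict:
    """Assemble an OpenAI-ChatCompletions-shaped request payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "thinking": thinking,  # hypothetical field name
    }

# Migrating from a soon-to-be-retired model means changing only the model name:
old = build_request("deepseek-chat", "Summarize this document.", thinking=False)
new = build_request("deepseek-v4-flash", "Summarize this document.", thinking=False)
print(new["model"])  # prints: deepseek-v4-flash
```

The same payload shape would work for deepseek-v4-pro; everything except the model string is unchanged.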