A year ago, DeepSeek was a curiosity — a scrappy Chinese lab that somehow built a model rivaling OpenAI’s o1 for a fraction of the cost. Markets panicked. Think pieces multiplied. Then the hype faded.

DeepSeek didn’t.

Today the Hangzhou-based lab released V4 in two variants — V4-Pro and V4-Flash — and the numbers are hard to ignore. This is the largest open-weights AI model ever released, and it’s trading blows with the best closed-source systems from OpenAI, Google, and Anthropic.

1.6 Trillion Parameters, MIT Licensed

V4-Pro packs 1.6 trillion parameters in a Mixture-of-Experts architecture, with 49 billion activated per token, pre-trained on 33 trillion tokens. V4-Flash is the lighter sibling: 284 billion total, 13 billion active, trained on 32 trillion tokens. Both ship with a 1 million token context window as the default, not a premium tier.
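Those MoE ratios are striking when made concrete: only a few percent of each model's weights fire for any given token. A quick back-of-the-envelope in Python, using nothing but the figures above:

```python
# Back-of-the-envelope MoE sparsity from the released specs
# (parameter counts in billions, as published).
specs = {
    "V4-Pro":   {"total_b": 1600, "active_b": 49},
    "V4-Flash": {"total_b": 284,  "active_b": 13},
}

for name, s in specs.items():
    frac = s["active_b"] / s["total_b"]
    print(f"{name}: {s['active_b']}B of {s['total_b']}B active "
          f"({frac:.1%} of weights per token)")
```

Roughly 3% of V4-Pro and 5% of V4-Flash activates per token, which is how a 1.6-trillion-parameter model stays affordable to serve.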

Both are MIT licensed. Download them, modify them, ship them commercially. In a world where the most powerful AI sits behind API paywalls, that still matters enormously.

The Benchmarks: Coding Beast, Knowledge Gap

Coding is V4-Pro’s strongest card. LiveCodeBench: 93.5 (Gemini 91.7, Claude 88.8). Codeforces rating: 3206 (GPT-5.4: 3168). SWE-bench Verified: 80.6, essentially tied with Claude Opus 4.6’s 80.8. Independent testing from Vals AI confirmed it “significantly outperformed every other open-source system at generating code.”

Math is world-class. IMOAnswerBench: 89.8, demolishing Claude (75.3) and trailing GPT-5.4 (91.4) by a hair.

General knowledge is the gap. MMLU-Pro matches GPT-5.4 at 87.5 but trails Gemini-3.1-Pro (91.0). SimpleQA-Verified is the widest miss — 57.9 versus Gemini’s 75.6. DeepSeek’s own assessment is refreshingly honest: V4-Pro trails frontier models “by approximately 3 to 6 months.”

But when you see the pricing, that gap starts to look very different.

The Price War Nobody Can Win

V4-Flash: $0.14 per million input tokens, $0.28 output. V4-Pro: $1.74 input, $3.48 output.

For context: GPT-5.5, announced just yesterday, runs $5/$30. Claude Opus 4.7 is $5/$25. V4-Pro delivers comparable performance for roughly 65% less on input tokens and 86-88% less on output. V4-Flash undercuts OpenAI's cheapest model while being dramatically more capable.
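Plugging the published prices into a per-request cost makes the gap concrete. A minimal sketch, using the prices above; the 30k-input/10k-output workload is an arbitrary assumption for illustration:

```python
# Published prices, USD per million tokens: (input, output).
prices = {
    "V4-Flash":        (0.14, 0.28),
    "V4-Pro":          (1.74, 3.48),
    "GPT-5.5":         (5.00, 30.00),
    "Claude Opus 4.7": (5.00, 25.00),
}

def request_cost(model, in_tokens, out_tokens):
    """USD cost of a single request at list price."""
    p_in, p_out = prices[model]
    return (in_tokens * p_in + out_tokens * p_out) / 1_000_000

# Hypothetical workload: 30k input tokens, 10k output tokens.
for model in prices:
    print(f"{model:16s} ${request_cost(model, 30_000, 10_000):.4f} per request")
```

On that workload, V4-Pro comes in around $0.09 per request against $0.45 for GPT-5.5; at scale, that difference compounds into the structural pressure described below.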

The efficiency gains aren’t magic — they’re engineering. DeepSeek’s new Sparse Attention architecture with token-wise compression means V4-Pro needs only 27% of the FLOPs and 10% of the KV cache compared to V3.2 in million-token contexts. Flash pushes further: 10% FLOPs, 7% KV cache.
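DeepSeek hasn't published implementation details here, but the general idea behind sparse attention is easy to sketch: each query attends to a small, high-scoring subset of the cached keys instead of all of them. A toy NumPy illustration of that generic mechanism; the top-k selection rule and the sizes are assumptions for illustration, not DeepSeek's actual architecture:

```python
import numpy as np

# Toy sparse attention over a KV cache: attend only to the top-k
# highest-scoring cached keys rather than the full context.
rng = np.random.default_rng(0)
d, n_ctx, k = 64, 4096, 256          # head dim, cached tokens, kept tokens

q = rng.standard_normal(d)           # current query
K = rng.standard_normal((n_ctx, d))  # cached keys
V = rng.standard_normal((n_ctx, d))  # cached values

scores = K @ q / np.sqrt(d)             # scores against every cached key
top = np.argpartition(scores, -k)[-k:]  # indices of the k best keys

# Softmax only over the selected subset, then mix their values.
w = np.exp(scores[top] - scores[top].max())
w /= w.sum()
out = w @ V[top]                     # attention output from k tokens

print(f"attended to {k}/{n_ctx} tokens ({k / n_ctx:.1%} of the KV cache)")
```

Attending to 256 of 4,096 cached tokens touches about 6% of the cache, which is the flavor of saving behind the 7-10% KV-cache figures above; production systems select and compress tokens far more cleverly than a raw top-k.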

Every time DeepSeek ships a model, the price floor drops. That’s great for builders. It’s existential for companies charging 10x more for comparable quality.

The Huawei Factor: Goodbye, Nvidia Dependency

The most geopolitically loaded detail: V4 runs natively on Huawei’s Ascend AI chips. Reuters reports Huawei worked closely with DeepSeek on integration across its full line of high-performance systems.

The timing is pointed. Yesterday the White House accused Chinese firms — DeepSeek named explicitly — of stealing U.S. AI labs’ intellectual property “on an industrial scale.” OpenAI and Anthropic have alleged DeepSeek improperly distilled their proprietary models. DeepSeek denies it.

But the hardware story is the real signal. Washington’s export control strategy assumed Chinese chip alternatives wouldn’t be good enough fast enough. V4 running on Ascend hardware suggests that assumption is crumbling.

What This Actually Means

The open-source gap is nearly closed. A year ago, there was a meaningful capability difference between open and closed models. V4-Pro essentially eliminates it for coding, math, and agentic tasks. The remaining gap — world knowledge — narrows with each release.

The pricing pressure is structural. OpenAI just launched GPT-5.5 at 36x the cost of V4-Flash on input tokens, for perhaps a 6-month capability lead. That premium gets harder to justify every quarter.

The hardware decoupling is real. If frontier-competitive models run on Chinese silicon, the strategic logic of export controls starts to erode.

Should You Use It?

If you’re building production applications and cost matters — and it always does — V4-Pro deserves serious evaluation. The SWE-bench and coding benchmarks say it’s ready for real work.

If you’re an AI tinkerer, V4-Flash is absurd value. Near-frontier performance at the bottom of the market. Available now through DeepSeek’s API, OpenRouter, or the official apps (Flash as “Instant Mode,” Pro as “Expert Mode”).

DeepSeek isn’t a curiosity anymore. It’s a consistent force — and the most disruptive player in AI isn’t in San Francisco.