Chinese AI models: GLM-5.2, Kimi K2.7, Qwen 3, DeepSeek
China's AI labs have stopped chasing the West and started setting their own pace. Four open-weights models doing the rounds in June 2026 (GLM-5.2 from Zhipu AI, Kimi K2.7-Code from Moonshot, Alibaba's Qwen line, and DeepSeek) show how far that shift has come. Between them, they offer some of the best price-to-capability ratios on the market right now.
A note before we go further: the exact model names and benchmark numbers in this piece come from the original reporting, and several don't line up with what vendors and independent testers have since published. Where a figure looks off, we've flagged it. Read the specifics as a snapshot of the conversation in mid-2026, not as settled fact, and check the linked sources before you act on any single number.
If you run a business and you've been pricing AI features off Western closed models, the headline is simple. The cheap, capable open-weights options coming out of China have changed what "expensive" means. DeepSeek's pricing alone has reset what teams expect to pay, and it's pulling the rest of the market down with it.
You don't need to read Mandarin or self-host anything to feel the effect. These models show up on the same API marketplaces you already use, and the price gap against the big-name closed models is the kind that shows up on your monthly bill. The rest of this article walks through each model, what it's good at, and where the published numbers should be treated with caution.
The four models
| Model | Lab | SWE-bench Pro | MMLU | Context | Price, USD per 1M tokens (in / out) |
|---|---|---|---|---|---|
| GLM-5.2 | Zhipu AI | 51.4% | 85.2% | 256K | $0.80 / $2.40 |
| Kimi K2.7-Code | Moonshot | 56.8% | 85.7% | 256K | $0.50 / $2.00 |
| Qwen 3 | Alibaba | 46.2% | 84.6% | 128K | $0.40 / $1.20 |
| DeepSeek V3.5 | DeepSeek | 52.4% | 85.8% | 1M | $0.15 / $0.60 |
A caution on the table above: independent reporting contradicts several of these figures. There is no model called "DeepSeek V3.5"; DeepSeek's current line is V4-Flash and V4-Pro, both with 1M context, priced around $0.14/$0.28 and $0.435/$0.87 respectively (DeepSeek API Docs). GLM-5.2's real context window is reportedly 1M, not 256K (Pandaily), and its SWE-bench Pro and MMLU scores are reported higher than shown here, near 62.1% and 88% (StableLearn). "Qwen 3" is an outdated label for Alibaba's current 3.5/3.6/3.7 generation, which supports up to 1M context, not 128K (QwenLM/Qwen3). Treat the per-row numbers as the original article's claims, not verified specs.
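To see what the table's price column means on a bill, here is a minimal Python sketch. It assumes the prices are in USD per one million tokens and uses a hypothetical workload of 200M input and 50M output tokens per month; the workload and the names `Price`, `TABLE_PRICES`, and `monthly_cost` are illustrative assumptions, and the prices are the article's claimed figures, not verified ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per one million tokens (assumed unit)."""
    input_per_m: float
    output_per_m: float


# Prices as listed in the table (the original article's claims, not verified specs).
TABLE_PRICES = {
    "GLM-5.2": Price(0.80, 2.40),
    "Kimi K2.7-Code": Price(0.50, 2.00),
    "Qwen 3": Price(0.40, 1.20),
    "DeepSeek V3.5": Price(0.15, 0.60),
}


def monthly_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at a per-million-token price."""
    input_cost = (input_tokens / 1_000_000) * price.input_per_m
    output_cost = (output_tokens / 1_000_000) * price.output_per_m
    return input_cost + output_cost


# Hypothetical workload: 200M input tokens and 50M output tokens per month.
INPUT_TOKENS = 200_000_000
OUTPUT_TOKENS = 50_000_000

costs = {
    name: monthly_cost(price, INPUT_TOKENS, OUTPUT_TOKENS)
    for name, price in TABLE_PRICES.items()
}

# Print cheapest first.
for name, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{name:<16} ${cost:,.2f}")
```

On that assumed workload, the table's prices work out to about $60 a month for the DeepSeek row and $280 for GLM-5.2. Swapping in the prices a vendor currently publishes is a one-line change to `TABLE_PRICES`.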
DeepSeek V3.5: The value leader
On a global scale, DeepSeek is the one that turns heads. The article puts it at $0.15/$0.60 with a 1M context and 52.4% SWE-bench Pro, which would undercut Western alternatives roughly tenfold. Those exact figures are attached to a model name that doesn't exist (there is no "DeepSeek V3.5"), but the underlying story is real: DeepSeek's actual V4-Flash runs a 1M context at about $0.14 input and $0.28 output (DeepSeek API Docs), still far below the closed-model field. DeepSeek's pricing reads like a play for adoption rather than margin, and it's dragging the whole market's price expectations down with it.
Kimi K2.7-Code: The coding specialist
Moonshot built Kimi K2.7-Code for software engineering, and the article rates it the best Chinese model for the job at 56.8% SWE-bench Pro. That number is worth a caveat: no independent third party has published SWE-bench Pro results for K2.7, so every score floating around traces back to Moonshot's own runs (MarkTechPost). The model itself is confirmed, released 12 June 2026, with a 256K context window (explainX), and it lands somewhere near Western mid-tier closed models while staying fully open. The $0.50/$2.00 pricing in the table is unconfirmed. Either way, Moonshot has picked a clear lane: coding, not everything-at-once.
GLM-5.2: The parameter giant
Zhipu AI's GLM-5.2 carries the largest disclosed parameter count of any Chinese open model: 753 billion total in a mixture-of-experts design, released under an MIT license (StableLearn). The article reports strong general knowledge (85.2% MMLU) but only mid-tier coding (51.4% SWE-bench Pro), and prices it as the most expensive of the four. Both benchmark figures look understated against what has since been reported, closer to 88% MMLU and 62.1% SWE-bench Pro, and the model's real context window is reportedly 1M, not the 256K shown in the table (Pandaily). On the specs that hold up, GLM-5.2 is the premium pick inside the Chinese ecosystem.
Qwen 3: The multilingual gateway
Alibaba's Qwen comes out weakest on the raw benchmarks in this article (46.2% SWE-bench Pro and a 128K context) but strongest on multilingual coverage, with best-in-class support for Mandarin and other Asian languages. Two cautions here. First, "Qwen 3" is a dated name for the current 3.5/3.6/3.7 line, and the real models support up to 1M context (QwenLM/Qwen3). Second, their coding scores are reported much higher than quoted: SWE-bench Verified for Qwen3.6-Plus sits around 78.8% (QwenLM/Qwen3), although SWE-bench Verified and SWE-bench Pro are different benchmarks, so that figure isn't directly comparable with the 46.2% in the table. Even with that caveat, the "weakest" framing looks doubtful for the up-to-date version. At $0.40/$1.20, it's still the cheapest sensible entry point for Asian-language work.
Collective impact
The bigger pattern matters more than any single spec. Chinese open-weights labs are pushing global prices down, and the directional claim is well supported: DeepSeek V4-Flash undercuts Gemini 3.5 Flash by roughly 10x on input and 30x on output (TechBullion).
One claim in the original needs correcting. It says DeepSeek's pricing forced Google's hand on Gemini 3.5 Flash. The reporting points the other way: Gemini 3.5 Flash, launched 20 May 2026, actually raised prices about 3x over the model it replaced, to roughly $1.50 input and $9 output (XDA). So treat the "DeepSeek forced Google down" line as unconfirmed at best.
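The 10x and 30x multiples are easy to sanity-check from the approximate prices quoted in this article. The short Python sketch below assumes all four figures are USD per one million tokens; the variable names are illustrative, and the prices are rounded as reported, so the ratios are rough.

```python
# Approximate USD prices per one million tokens, as quoted in this article.
deepseek_v4_flash = {"input": 0.14, "output": 0.28}
gemini_35_flash = {"input": 1.50, "output": 9.00}

# How many times more expensive Gemini 3.5 Flash is, per token type.
ratios = {
    kind: gemini_35_flash[kind] / deepseek_v4_flash[kind]
    for kind in ("input", "output")
}

for kind, ratio in ratios.items():
    print(f"{kind}: Gemini 3.5 Flash is about {ratio:.1f}x the DeepSeek V4-Flash price")
```

With these inputs the gap comes out near 10.7x on input and 32.1x on output, which matches the "roughly 10x and 30x" figures.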
The frontier point holds up better. MiniMax M3, released 1 June 2026 by Shanghai-based MiniMax, is reported at 59.0% SWE-bench Pro with a 1M context and input pricing around $0.30/M (VentureBeat). Worth noting: that 59.0% was largely run on MiniMax's own infrastructure with agent scaffolding and hasn't been independently verified, and the $1.20 output figure in the article isn't separately confirmed. Still, it suggests Chinese labs are showing up at the top end, not just at the bottom of the price list.
Verdict
Chinese AI models have moved past "good for the price." They're just good. DeepSeek and Kimi K2.7-Code in particular belong on the shortlist for any serious AI strategy, wherever you're based. The open-weights bet these labs have made, publishing the weights instead of locking them behind an API, gives them a structural edge that closed Western models can't easily copy. Just verify the exact model name, pricing, and benchmark before you commit; in this corner of the market the specifics move fast and the published numbers don't always agree.



