
Model Review

GLM-5.2 vs Kimi K2.7-Code: Chinese models compared.

Zhipu AI's GLM-5.2 ($0.80/$2.40, 51.4% SWE-bench Pro) vs Moonshot's Kimi K2.7-Code ($0.50/$2.00, 56.8% SWE-bench Pro). Two leading Chinese open-weights models face off.


Decision

Shortlist

Score tools by workflow fit, data handling, owner readiness, and cost at scale before buying seats.

Risk to watch

Shelfware

A capable tool still fails if nobody owns the workflow or checks whether it is used weekly.

Proof to collect

Pilot score

Run one real task through each shortlisted tool and record quality, time saved, and support burden.

TL;DR

China's two strongest open-weights models squared off in mid-2026. [GLM-5.2](https://the-decoder.com/zhipu-ais-glm-5-2-closes-in-on-closed-source-leaders-in-coding-marathons/) comes from Zhipu AI; [Kimi K2.7-Code](https://huggingface.co/moonshotai/Kimi-K2.7-Code) comes from Moonshot AI. Both are real, both shipped in June 2026, and both are worth a look if your team wants a capable model without Western vendor lock-in. The catch: many of the head-to-head numbers passed around in comparisons like this one do not match what the labs and independent trackers actually published. Where the figures are unconfirmed or wrong, we say so below.

Key takeaways

  • Two Chinese labs, Zhipu AI and Moonshot AI, now ship open-weights models that compete with the closed-source names, at a fraction of the price.
  • Head-to-head benchmarks: most of the widely circulated comparison table (SWE-bench Pro, MMLU, context window, price, parameters) does not hold up against the sources.
  • Where Kimi K2.7-Code wins: **Software engineering.** The name is honest about the focus, and on provider pricing it is the cheaper of the two.
  • Where GLM-5.2 wins: **Knowledge capacity and context.** GLM-5.2 carries 753 billion total parameters ([ForkLog](https://forklog.com/en/zhipu-ai-launches-glm-5-2-with-1-million-token-context/)) and a 1 million token context window.


Analysis

If you run a business in Australia and you have been keeping half an eye on AI tooling, here is the short version. The best coding models no longer all come from San Francisco. Two Chinese labs, Zhipu AI and Moonshot AI, now ship open-weights models that go toe to toe with the closed-source names you already know, and they do it at a fraction of the price.

That matters for a practical reason. Open weights mean you, or a vendor you trust, can run the model yourself instead of renting it through a foreign API. For a finance team handling client data or a dev shop nervous about where its code goes, that is not a small thing.

The trouble starts when you try to pick a winner. Comparison tables for GLM-5.2 and Kimi K2.7-Code are floating around the internet, and many of them, including the one this article was built from, get the headline numbers wrong. Some are off by a few points. At least one has the result backwards. So treat any clean-looking "X beats Y by 5.4 points" table with suspicion, including ours, and check the figures against the labs.

What follows keeps every number from the original comparison so you can see what was claimed, then sets it against what the sources actually report.

Head-to-head benchmarks

| Metric | GLM-5.2 | Kimi K2.7-Code | Delta |
| --- | --- | --- | --- |
| SWE-bench Pro | 51.4% | 56.8% | +5.4 pts (Kimi) |
| MMLU | 85.2% | 85.7% | +0.5 pts (Kimi) |
| Context window | 256K | 256K | n/a |
| Price (input) | $0.80 / 1M | $0.50 / 1M | Kimi cheaper |
| Price (output) | $2.40 / 1M | $2.00 / 1M | Kimi cheaper |
| Parameters | 753B (MoE) | Not disclosed | n/a |

A warning before you act on this table: most of it does not hold up. We have kept the original figures so you can see what was circulating, but here is what the sources actually say, row by row.

  • SWE-bench Pro. The 51.4% / 56.8% split, and the idea that Kimi leads by 5.4 points, is not supported. Published reporting puts GLM-5.2 at 62.1 on SWE-bench Pro, the top open-source result on that benchmark, while Moonshot's own number for Kimi K2.7-Code is 58.6 (VentureBeat). In other words, the direction is reversed: on sourced figures GLM-5.2 is ahead by 3.5 points, not behind. Moonshot's 58.6 was also vendor-reported, with practitioners flagging that the benchmarks did not fully check out (VentureBeat).
  • MMLU. The 85.2% / 85.7% figures appear to be invented. No reporting we could find gives these MMLU numbers for either model (LLM-Stats). Treat them as unconfirmed.
  • Context window. This row is wrong for GLM-5.2. Kimi K2.7-Code does land around 256K. But GLM-5.2's headline feature is a 1 million token context window, not 256K (Pandaily). So this is not a tie; GLM-5.2 holds a large advantage on context.
  • Price. Neither price row matches any provider rate we could verify. First-party Z.ai pricing for GLM-5.2 runs closer to $1.40 input / $4.40 output per 1M tokens (WaveSpeed), and OpenRouter lists Kimi K2.7-Code at $0.74 input / $3.50 output (OpenRouter). The $0.80/$2.40 and $0.50/$2.00 figures above are unconfirmed.
  • Parameters. GLM-5.2's 753B (MoE) checks out (ForkLog). Kimi K2.7-Code is not undisclosed, though: its specs are public at roughly 1 trillion total MoE parameters with 32B active (Hugging Face). That makes Kimi the larger model by total parameter count, not the smaller one.
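
The SWE-bench Pro reversal described in the list above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the scores are the ones quoted in this article, and the names `CLAIMED`, `SOURCED`, and `lead` are our own, not anything published by either lab.

```python
# SWE-bench Pro scores (percent) as quoted in this article.
# CLAIMED is the original table; SOURCED is the reporting cited above.
CLAIMED = {"GLM-5.2": 51.4, "Kimi K2.7-Code": 56.8}
SOURCED = {"GLM-5.2": 62.1, "Kimi K2.7-Code": 58.6}


def lead(scores):
    """Return (leader, margin in points) for a two-model score dict."""
    (name_a, a), (name_b, b) = scores.items()
    if a >= b:
        return name_a, round(a - b, 1)
    return name_b, round(b - a, 1)


for label, scores in (("claimed", CLAIMED), ("sourced", SOURCED)):
    leader, margin = lead(scores)
    print(f"{label}: {leader} leads by {margin} pts")
```

Run as written, the claimed figures put Kimi K2.7-Code ahead by 5.4 points, while the sourced figures put GLM-5.2 ahead by 3.5. Since one of the sourced numbers is vendor-reported, the margin itself deserves less trust than the direction.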

Where Kimi K2.7-Code wins

Software engineering. The name is honest about the focus. Moonshot built K2.7-Code as a coding-first model for end-to-end programming and agentic work, and it reports a +21.8% gain on Kimi Code Bench v2 over the previous K2.6 (MarkTechPost). So it is genuinely a strong coder. What we cannot stand behind is the original claim that it beats GLM-5.2 on SWE-bench Pro by 5.4 points. On the sourced figures, GLM-5.2 scores higher there. If coding is your priority, both are contenders, and you should test them on your own codebase rather than trust a single benchmark line.

Price. The original framing had Kimi as the cheaper option at $0.50/$2.00. The verified rates tell a less tidy story: Kimi sits around $0.74 input / $3.50 output per 1M tokens (OpenRouter) versus GLM-5.2's roughly $1.40 / $4.40 (WaveSpeed). So Kimi does come out cheaper on real provider pricing, just not at the numbers first stated. Note that one figure is a first-party rate and the other an OpenRouter listing, so the comparison is indicative rather than like for like. At volume, the gap is worth modelling against your actual token usage.
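
As a starting point for that modelling, here is a minimal sketch. It assumes the per-1M-token rates quoted above still apply and that billing is a simple linear function of input and output tokens; the `monthly_cost` helper and the workload figures are hypothetical.

```python
# Rates in USD per 1M tokens, as quoted in this article. They come from
# different providers (Z.ai first-party vs OpenRouter), so treat the
# comparison as indicative only.
RATES = {
    "GLM-5.2": {"input": 1.40, "output": 4.40},
    "Kimi K2.7-Code": {"input": 0.74, "output": 3.50},
}


def monthly_cost(model, input_tokens, output_tokens):
    """Estimated monthly spend in USD for the given token volumes."""
    rate = RATES[model]
    cost = (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1_000_000
    return round(cost, 2)


# Hypothetical workload: 100M input and 20M output tokens per month.
for model in RATES:
    print(model, monthly_cost(model, 100_000_000, 20_000_000))
```

On that made-up workload the sketch gives about $228 a month for GLM-5.2 and about $144 for Kimi K2.7-Code. Swap in your own token counts and current provider rates before drawing any conclusion, since caching discounts and tiered pricing are not modelled here.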

Where GLM-5.2 wins

Knowledge capacity and context. GLM-5.2 carries 753 billion total parameters (ForkLog). The original article leaned on that as a representational-capacity edge, but the comparison is muddier than it looked, because Kimi K2.7-Code is the larger model on paper at about 1T total / 32B active (Hugging Face). The clearer GLM-5.2 advantage is its 1 million token context window (Pandaily), roughly four times Kimi's. If your work involves feeding large documents, long codebases, or whole knowledge bases into a single prompt, that is a real, verifiable point in GLM-5.2's favour.
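
To get a feel for what the context gap means in practice, the sketch below checks whether a document would fit each window. It rests on loud assumptions: the window sizes are the round figures quoted in this article, and the four-characters-per-token rule is a rough heuristic for English text, not either model's real tokenizer. The helper names are ours.

```python
# Context windows in tokens, as quoted in this article. The exact limits
# are an assumption; check the provider documentation.
CONTEXT_WINDOW = {"GLM-5.2": 1_000_000, "Kimi K2.7-Code": 256_000}

# Rough rule of thumb for English text, NOT a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text):
    """Crude token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def fits(text, model, reserve_for_output=8_000):
    """True if the prompt plus an output reserve fits the model's window."""
    return estimate_tokens(text) + reserve_for_output <= CONTEXT_WINDOW[model]


document = "x" * 2_000_000  # stand-in for a ~2M-character codebase dump
for model in CONTEXT_WINDOW:
    print(model, "fits" if fits(document, model) else "too long")
```

Under these assumptions a two-million-character dump fits the 1 million token window but not the 256K one. For a real decision, count tokens with each model's own tokenizer, because code and non-English text often tokenize less efficiently than this heuristic suggests.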

Chinese language depth. Both models are strong in Mandarin. The original claim that GLM-5.2 has a marginal edge on classical Chinese, Chinese legal terminology, and Chinese-specific knowledge benchmarks is unconfirmed; we found no sourced benchmark data comparing the two on those tasks (Artificial Analysis). Take it as an unverified editorial impression, not a measured result.

Verdict

Here is the honest answer. The original take crowned Kimi K2.7-Code as the better all-rounder, and built that case on cheaper pricing, a coding win, and near-equal knowledge. But that case rested on numbers that do not survive contact with the sources. On verified figures, GLM-5.2 leads on SWE-bench Pro and on context length, Kimi is the larger model rather than the smaller one, and both models cost more than claimed (though Kimi is still cheaper).

So we are not declaring a winner. Both are credible open-weights models from serious labs, and the right choice depends on what you actually need: GLM-5.2 if long context and a top SWE-bench Pro result matter most, Kimi K2.7-Code if you want a coding-focused model at the lower price. The benchmark wars between these two are noisy and, in places, disputed even by practitioners (VentureBeat). Run both against your own work before you commit.

Winner: too close, and too contested, to call from the published benchmarks. Test both on your own tasks.

Source trail

Primary references to keep this briefing grounded

AI and automation information changes quickly. Use these official or primary references to verify the claims, pricing, product behaviour, and compliance details before committing budget or production data.

What to do next

  1. Write the job-to-be-done before looking at another product.
  2. Score each shortlisted tool for workflow fit, data handling, cost, and owner readiness.
  3. Run one small pilot and remove anything the team does not use weekly.
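
The scoring step in the list above can be as simple as a weighted average. The sketch below is one hypothetical way to do it: the weights, the 1 to 5 rating scale, and the placeholder tool names are all assumptions to be replaced with your own.

```python
# Hypothetical weights for the four criteria named above; they sum to 1.
WEIGHTS = {
    "workflow_fit": 0.4,
    "data_handling": 0.3,
    "cost": 0.2,
    "owner_readiness": 0.1,
}


def pilot_score(ratings):
    """Weighted average of 1-5 ratings, one per criterion in WEIGHTS."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings: {sorted(missing)}")
    return round(sum(WEIGHTS[k] * ratings[k] for k in WEIGHTS), 2)


# Made-up ratings for two placeholder tools, purely for illustration.
shortlist = {
    "tool_a": {"workflow_fit": 4, "data_handling": 5, "cost": 3, "owner_readiness": 2},
    "tool_b": {"workflow_fit": 3, "data_handling": 3, "cost": 5, "owner_readiness": 4},
}
for name, ratings in sorted(shortlist.items(), key=lambda kv: -pilot_score(kv[1])):
    print(name, pilot_score(ratings))
```

Fixing the weights before the pilot starts keeps the score from being tuned after the fact to favour whichever tool the team already prefers.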


AI Kick Start is an Illawarra-based AI studio in Figtree, helping businesses across Wollongong, Shellharbour and Kiama and right across Australia put AI to work.
