Analysis
For most of the past few years, the big Western labs treated open-weights models as a side project for researchers and weekend tinkerers. That view did not survive the first six months of 2026.
In the space of one half-year, a string of releases turned open weights from a hobbyist corner into the part of the market everyone now watches. Most of the pressure came from Chinese labs. Meta kept pushing Llama. The result is a set of freely downloadable models that a business can run on its own hardware, in its own region, under its own rules, and get work done that used to require a paid subscription to OpenAI, Google, or Anthropic.
For an Australian team, the practical question is simple. If a model you can download and host yourself does 90% of what the expensive API does, at a fraction of the cost, why keep paying the premium? That question is now sitting on a lot of desks.
Here is what actually landed, and what to make of it.
The Chinese Open-Weights Surge
The standout feature of this wave is where it came from. Several of the major releases came from Chinese labs: MiniMax with M3, Zhipu (Z.ai) with GLM-5.2, DeepSeek with its updated flagship, and Moonshot AI with Kimi K2.7-Code. That says something about both China's technical depth and a deliberate bet on open weights as a way to compete.
The reasoning is not hard to follow. Chinese labs start at a disadvantage on closed APIs. US export controls limit access to the best GPUs. English-language training data is harder to come by. And plenty of Western firms are wary about sending sensitive data to a Chinese-run service. Open weights sidestep all three. Once the weights are out, anyone can run the model on their own machines, anywhere, and where it came from stops mattering for what it can do.
Pricing has been the other lever. MiniMax M3 launched at $0.30 input / $1.20 output per million tokens (the-decoder, June 2026). DeepSeek's March flagship, V4, came in around $0.30/$0.50 per million tokens with a 1M-token context window (DeepSeek API pricing guide, 2026). Some early reports floated a "V3.5" name and a $0.15/$0.60 price, but that price matches GPT-4o-mini rather than anything DeepSeek shipped. GLM-5.2's exact API price is still unsettled, with reported OpenRouter listings ranging from roughly $1.20/$3.20 to $1.40/$4.40 per million tokens (Simon Willison, June 2026). Even at those numbers, the open field has set a floor that forces the paid providers to defend their margins.
The idea that Google priced Gemini 3.5 Flash as a direct undercut in response does not hold up. Flash went generally available on 19 May 2026 at roughly $1.50/$9.00 per million tokens, about three times the price of the model it replaced, which is an increase rather than a discount (TechTimes, May 2026). So the pricing pressure is real, but at least one of the responses ran the other way.
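To make the spread concrete, here is a rough sketch in Python that turns the per-million-token prices quoted above into a monthly bill. The workload (500 million input tokens and 100 million output tokens a month) is a made-up figure for illustration, not something from any of the cited sources, and the names in the code are hypothetical. It covers API list prices only; the hardware and staffing cost of hosting the weights yourself is a separate sum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """API price in dollars per million tokens, as quoted in the article."""
    input_per_m: float
    output_per_m: float


# Prices as reported in the text above. GLM-5.2 is listed twice because
# its reported listings span a range rather than a single settled figure.
PRICES = {
    "MiniMax M3": Price(0.30, 1.20),
    "DeepSeek V4": Price(0.30, 0.50),
    "GLM-5.2 (low listing)": Price(1.20, 3.20),
    "GLM-5.2 (high listing)": Price(1.40, 4.40),
    "Gemini 3.5 Flash": Price(1.50, 9.00),
}


def monthly_cost(price: Price, input_tokens_m: float, output_tokens_m: float) -> float:
    """Return the bill in dollars for a workload given in millions of tokens."""
    return price.input_per_m * input_tokens_m + price.output_per_m * output_tokens_m


def cost_table(input_tokens_m: float, output_tokens_m: float) -> dict[str, float]:
    """Cost per model for one workload, rounded to whole cents."""
    return {
        name: round(monthly_cost(price, input_tokens_m, output_tokens_m), 2)
        for name, price in PRICES.items()
    }


if __name__ == "__main__":
    # Hypothetical workload: 500M input tokens and 100M output tokens a month.
    table = cost_table(input_tokens_m=500, output_tokens_m=100)
    for name, cost in sorted(table.items(), key=lambda item: item[1]):
        print(f"{name:<24} ${cost:>9,.2f}")
```

On that assumed workload the bill runs from about $200 a month for DeepSeek V4 to about $1,650 for Gemini 3.5 Flash. Output-heavy workloads widen the gap, because the output price is where the quoted models differ most.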

Capability Convergence
The trend that matters most is the one that is harder to put a single number on: open models are closing the gap on quality. Reports suggest open weights used to trail the paid models by a wide margin on standard benchmarks, and that the gap has shrunk to single digits on many tasks. The exact historical figures are not something we can pin to a clean source, so treat the size of the old gap as rough rather than precise. The direction, though, is showing up in the results.
MiniMax M3 reports 59.0% on SWE-bench Pro, which puts it just above GPT-5.5's 58.6% and about ten points behind Claude Opus 4.8's 69.2% (morphllm coding leaderboard, June 2026). One caveat: these are vendor-harness numbers, and vendor harnesses tend to run higher than standardised, independent leaderboards, so read them as self-reported rather than neutral. GLM-5.2 is a large mixture-of-experts model; sources put its total parameters at either 744B or 753B, with about 40B active. It posts competitive results across the board and currently tops Artificial Analysis's open-weights intelligence ranking (Simon Willison, June 2026). Some coverage has described it as the largest open model ever, but that claim does not survive scrutiny: Kimi K2.7-Code, at around a trillion total parameters, is bigger, so GLM-5.2's lead is about capability, not size.
Kimi K2.7-Code itself is a coding specialist from Moonshot AI with a 256K-token context, released open-source on 12 June 2026 (the April Moonshot release was the earlier K2.6, not this one). Moonshot published its own coding benchmarks, Kimi Code Bench v2 and the like, rather than a public SWE-bench score, so any SWE-bench figure circulating for it is unconfirmed (MarkTechPost, June 2026).
What this convergence means in practice: the genuine advantage of the paid models is shrinking back to the very top end, the hardest slice of tasks where the frontier still pulls ahead. For most everyday work, the open models are good enough now. And good enough, at a fraction of the price and running on your own infrastructure, is a strong argument.
Implications for the Industry
A few things follow from all this.
Pricing pressure on the paid APIs is not going away. OpenAI, Google, and Anthropic can either match the open field on price, which is hard given their heavier cost structures, or justify a premium through better quality, reliability, and the enterprise features that businesses actually pay for.
The basis of competition is also moving. As the raw models start to look interchangeable, the value shifts to what surrounds them: the tooling, the hosting, the support, and the industry-specific applications built on top. The model becomes the commodity; the system around it becomes the product.
And the regulatory picture gets messier. You cannot put open weights back in the box. Once a model is released, it spreads across the internet no matter what any government would prefer. That sits awkwardly against any plan to control AI through export rules or release restrictions, and it is a tension that is not going to resolve quietly.


