AI Coding

GLM 5.2 Inside Claude Code: A 756-Billion-Parameter Open-Source Model That Rivals Opus for a Fifth of the Price.

Daniel Fleuren | 2026-06-18 | 11 min read | Developers and technical teams | Updated 2026-06-22

Written by

Daniel Fleuren

Founder, AI Kick Start. 20+ years enterprise IT

Decision

Start narrow

Use the article to decide the smallest useful workflow worth testing before expanding the system.

Risk to watch

Hype drift

Avoid turning a practical adoption step into a broad transformation promise nobody can verify.

Proof to collect

Business signal

Write down the owner, data boundary, review point, and measurable outcome before the first build.

TL;DR

I switched Claude Code over to GLM 5.2 and ran it all day. It's a 756-billion-parameter open-source model you can route straight into the Claude Code harness for about a fifth of the price of Opus, and for most of my knowledge work it held up fine. Below, I show what it can build, where it beats Opus and where it doesn't, and exactly how to set it up so you can switch between models per project.

Source video

Original video: "GLM 5.2 in Claude Code is Blowing My Mind" (YouTube).

Briefing

The artificial intelligence landscape is shifting beneath our feet. For the past eighteen months, developers and knowledge workers have grown accustomed to a simple equation: the best models come from closed-source providers, cost a premium, and exist entirely on someone else's servers. That equation is being rewritten in real time. Enter GLM 5.2 - a 756-billion-parameter open-source model from Zhipu AI that slots directly into Claude Code's familiar harness, delivers performance within striking distance of Claude Opus 4.8, and does it all for roughly one-fifth of the cost.

Nate Herk, an AI automation educator with over 815,000 YouTube subscribers, recently spent an entire day hammering GLM 5.2 through Claude Code. His verdict? "It's incredible. It feels faster. It's significantly cheaper, and it just fits right into the Claude Code harness pretty well." In this article, we break down everything Herk discovered - what GLM 5.2 excels at, where it falls short, how it handles creative builds and deep research tasks, and exactly how you can configure Claude Code to run it yourself.

What GLM 5.2 Can Do: Design, Code, and Reason

Herk pitted GLM 5.2 head-to-head against Claude Opus 4.8 on real-world tasks that working developers and content creators actually perform. The results paint a nuanced picture of a model that punches well above its price point.

Front-End Design: Faster and Cheaper

One of the most striking comparisons involved front-end website design. Herk gave both models the same one-shot prompt to build a branded landing page. The results were remarkably similar - both produced professional-looking pages with matching brand styles, dynamic elements, hover animations, and call-to-action buttons. The only obvious tell? Opus's notorious fondness for a particular decorative font.

Where it gets interesting is speed and economics. GLM 5.2 completed its design in 3 minutes and 59 seconds. Opus 4.8 took 14 minutes and 59 seconds, nearly four times as long. GLM 5.2 also used fewer tokens, and with a per-token price roughly one-fifth of Opus's, the savings were substantial. As Herk put it: "These are both very solid for a one-shot prompt. Especially when you consider you're getting this output for like five times cheaper."

Coding Assignments: The Subtle Edge Cases

Herk also set a coding homework assignment. To eliminate cross-contamination, the assignment was written by a third-party model (Codex), and both models solved it independently. Codex then judged the results. In this particular test, Opus 4.8 edged ahead, but only by a narrow margin. The differentiator was a subtle edge case: duplicate records whose values are equivalent but spelled differently, such as "true" versus "1" or "1" versus "1.0". Opus handled this nuance; GLM 5.2 missed it.
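To make that edge case concrete, here is a minimal Python sketch of duplicate detection that treats equivalent spellings as the same value. It is not the assignment Herk used, which the article does not reproduce; the field names, the set of boolean words, and the helper names `canonical` and `dedupe` are all assumptions made for illustration.

```python
from decimal import Decimal, InvalidOperation

# Assumed convention for this sketch: which strings count as booleans.
TRUE_WORDS = {"true", "yes"}
FALSE_WORDS = {"false", "no"}


def canonical(value):
    """Map equivalent spellings of a value to one canonical form."""
    text = str(value).strip().lower()
    if text in TRUE_WORDS:
        return Decimal(1)
    if text in FALSE_WORDS:
        return Decimal(0)
    try:
        number = Decimal(text)  # "1" and "1.0" compare equal as Decimals
    except InvalidOperation:
        return text  # not numeric: compare as normalised text
    return number if number.is_finite() else text


def dedupe(records):
    """Keep the first record for each canonicalised set of field values."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(sorted((name, canonical(v)) for name, v in record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


records = [
    {"id": "7", "active": "true"},
    {"id": "7.0", "active": "1"},   # same record, different spellings
    {"id": "8", "active": "false"},
]
print(len(dedupe(records)))
```

A naive comparison of the raw strings would keep all three records, which is the kind of miss described above.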

The takeaway here is important. GLM 5.2 is genuinely capable for the vast majority of coding tasks, but for problems requiring extreme precision or the handling of subtle edge cases, Opus still holds an advantage. The question becomes whether that marginal improvement justifies a 5x price increase - and for most day-to-day work, the answer is likely no.

Creative One-Shot Builds: Letting the Model Express Itself

Herk also tested creative freedom with an open-ended /goal prompt: "Get creative. Show me how good your design skills are and just build me whatever you want. Just make me an HTML document."

GLM 5.2: "The Anatomy of Attention"

GLM 5.2 responded with a beautifully crafted interactive page titled "The Anatomy of Attention." It featured animated stars, explanatory text about how language models process information, and an interactive demonstration of the classic sentence: "The animal didn't cross the street because it was too tired." Users could hover over words to see attention relationships mapped visually. The page also included relationship graphs and data visualisations breaking down how tokens get placed in vector space.

Opus 4.8: "The Life of a Death Star"

Opus produced a timeline narrative called "The Life of a Death Star" walking through the lifecycle of the iconic weapon. It too was polished, though Herk noted the reappearance of Opus's beloved decorative font.

The Verdict on Creative Output

Both outputs were impressive one-shot creations. Herk concluded that Opus wasn't demonstrably five times better, despite costing five times as much. The speed result, however, reversed: GLM 5.2 took 35 minutes, while Opus finished in 11. This reinforces a pattern: the more reasoning a task demands, the more Opus pulls ahead. For design-heavy tasks without deep logical analysis, GLM 5.2 is competitive and far more economical.

STORM Research: Multi-Agent Deep Research on a Budget

Perhaps the most impressive demonstration came from a complex research task. Herk instructed the model to use the STORM research skill to investigate open-source versus closed-source AI models, with the final deliverable being a comprehensive HTML report.

How STORM Research Works

STORM (Synthesis of Topic Outlines through Retrieval and Multi-perspective question asking) deploys multiple sub-agents with different personas to investigate a topic from multiple angles simultaneously. GLM 5.2 spun up several specialised agents, each approaching the question from a different professional lens.

The resulting report was genuinely thorough. It included a 60-second summary, five key findings with persona attribution, a "hidden connections" section surfacing non-obvious relationships, explicit assumptions, and actionable recommendations. The report even underwent a second pass where fresh agents reviewed and refined the output - hence the "V2" designation.
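The fan-out, merge, and review flow described above can be sketched in a few lines of Python. This is not the STORM skill itself, only the general pattern; the persona list is hypothetical (the article does not name the personas GLM 5.2 chose), and `ask_model` is a placeholder for whatever function sends a prompt to your model and returns text.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical personas; the article does not list the ones GLM 5.2 chose.
PERSONAS = ["economist", "security engineer", "startup CTO"]


def research(topic, ask_model, personas=PERSONAS):
    """Ask every persona about the topic in parallel, then merge and review."""
    def investigate(persona):
        prompt = f"As a {persona}, list your key findings on: {topic}"
        return persona, ask_model(prompt)

    with ThreadPoolExecutor(max_workers=len(personas)) as pool:
        findings = list(pool.map(investigate, personas))  # keeps persona order

    draft = "\n".join(f"[{persona}] {notes}" for persona, notes in findings)
    # Second pass: a fresh call reviews the merged draft (the "V2" step).
    return ask_model(f"Review and refine this report:\n{draft}")
```

Passing `ask_model` in as a parameter keeps the sketch independent of any particular API client, so it can be exercised with a stub before pointing it at a real model.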

GLM 5.2 as a Research Workhorse

Herk concluded that GLM 5.2 is exceptionally well-suited for research tasks involving gathering data, synthesising opinions, pulling sources, and organising information into structured reports. Where he would still lean on Opus is for the analytical layer above that - interpreting what the research really means and figuring out how to apply it.

"It's not binary," Herk emphasised. "It's where in each process, what steps should I use what model for?"

When You Actually Need Opus: Honest Model Selection

Throughout his day of testing, Herk maintained a clear-eyed perspective on GLM 5.2's limitations. His honest assessment: "GLM 5.2 is really solid and it's pretty quick for most tasks that don't require heavy reasoning. Obviously, at the end of the day, Opus 4.8 is a better model. It's a closed source model."

The key question he posed is one every knowledge worker should ask themselves: how often do you actually need the full power of a frontier model like Opus? Herk's estimate: probably only 10-20% of tasks at most. The remaining 80% or more - routine coding, design drafts, content generation, research gathering, documentation - can likely be handled by more efficient models like GLM 5.2 or even Sonnet 3.7.

This understanding of which model to deploy per task will become a critical meta-skill as the AI ecosystem continues to fragment and specialise. The most effective users won't default to the most expensive model for every job. They'll build intuition about task complexity, match that to model capability, and route work accordingly.
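As a rough illustration of routing work by task type, the sketch below sends a short list of reasoning-heavy categories to the expensive model and everything else to the cheaper one. The category names and model labels are assumptions for the example; Herk describes the principle, not a specific rule set.

```python
# Assumed task categories, loosely following the article's split.
HEAVY_REASONING = {"architecture review", "edge-case audit", "research analysis"}


def pick_model(task_type, heavy_model="opus", default_model="glm-5.2"):
    """Return the model label to use for a given task type."""
    if task_type.strip().lower() in HEAVY_REASONING:
        return heavy_model
    return default_model


for task in ["design draft", "documentation", "edge-case audit"]:
    print(f"{task}: {pick_model(task)}")
```

In practice the list of heavy categories is the part worth tuning: start with everything on the cheaper model and move a category up only when you see it fail.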

Why Open Source Matters: Ownership in an Uncertain Market

Herk dedicated significant time to a point that extends beyond raw performance metrics: the fundamental value of open-source AI models in a market dominated by unprofitable closed-source providers.

The Sustainability Problem

Anthropic and OpenAI - the two leading closed-source AI companies - are not currently profitable businesses. Herk noted that Claude Max subscribers paying $200 per month can extract the equivalent of $8,000 worth of inference if they fully utilise their quotas. That is not a sustainable economic model, and it raises uncomfortable questions about the long-term availability and pricing of these services.

He drew a pointed parallel to the Fable situation - referring to a model or feature that was unexpectedly pulled from users. "That just tells you that we are renting something that could be taken away from us for, you know, out of nowhere." Open-source models offer a hedge against this volatility. If you can download and run a model locally, no provider can change pricing, remove features, or shut down access overnight.

The Infrastructure Reality

There is, of course, a catch. GLM 5.2 is enormous - 756 billion parameters - and most individuals don't have the hardware infrastructure to run it locally. That's where hosted open-source platforms like Z.ai come in. They provide API access to these massive models at prices that, while not free, are dramatically lower than closed-source equivalents. As Herk observed, "It's so much cheaper than Claude. So everyone is freaking out because it's basically yours. You're able to download it or get it for much cheaper."

Benchmarks and Pricing: The Numbers Behind the Hype

Let's talk specifics. GLM 5.2's pricing through Z.ai stands at $1.40 per million input tokens and $4.40 per million output tokens. Compare that to Opus 4.8 at $5.00 per million input tokens and $25.00 per million output tokens. That makes GLM 5.2 about 3.6x cheaper on input and about 5.7x cheaper on output, so output-heavy tasks see the largest savings.
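To see how the quoted per-token prices translate into a bill, here is a small Python calculation. The prices are the ones quoted in this section; the workload of 2 million input tokens and 1 million output tokens is a made-up example, so substitute your own numbers.

```python
# Prices in US dollars per million tokens, as quoted in the article.
PRICES = {
    "glm-5.2": {"input": 1.40, "output": 4.40},
    "opus-4.8": {"input": 5.00, "output": 25.00},
}


def job_cost(model, input_tokens, output_tokens):
    """Cost in dollars for one job at the quoted per-million-token prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 2M input tokens and 1M output tokens.
glm = job_cost("glm-5.2", 2_000_000, 1_000_000)
opus = job_cost("opus-4.8", 2_000_000, 1_000_000)
print(f"GLM 5.2: ${glm:.2f}  Opus 4.8: ${opus:.2f}  ratio: {opus / glm:.1f}x")
# prints: GLM 5.2: $7.20  Opus 4.8: $35.00  ratio: 4.9x
```

The blended ratio depends on the input-to-output mix: the more output a task generates, the closer the saving gets to the output-price ratio.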

Competitive Performance

Despite this dramatic price difference, GLM 5.2 benchmarks surprisingly close to the frontier models. On the Frontier S SWE (Software Engineering) benchmark, it actually outperformed GPT-5.5. Compared to earlier Claude versions, it beat Opus 4.7 in numerous evaluations and surpassed Sonnet 3.7 across a wide range of tasks. For a model you can effectively own rather than rent, these numbers are remarkable.

Subscription Tiers

Z.ai offers both pay-per-token pricing and subscription plans. Monthly plans range from approximately $16 to $144, with annual billing offering additional savings. Herk himself tested GLM 5.2 on the $64 monthly plan, running five simultaneous sessions for four to five hours straight. After that heavy usage, his 5-hour quota was slightly over halfway consumed, and his weekly quota sat at about 10%. For most users, even the mid-tier plan offers substantial capacity.

How to Set Up GLM 5.2 in Claude Code: A Step-by-Step Guide

Claude Code reads its API endpoint, credentials, and model names from environment settings, so swapping the underlying model comes down to editing one configuration file.

Step 1: Sign Up for Z.ai

Visit z.ai and create an account. The platform offers a browser-based chat interface where you can test GLM 5.2 directly. Herk noted the model is particularly impressive at mini-games and front-end design tasks.

Step 2: Generate Your API Key

Navigate to the API console and create a new API key. You'll need this for the Claude Code configuration.

Step 3: Configure Claude Code

In your Claude Code project directory, locate or create a .claude/settings.local.json file. Add the following configuration (swapping in your actual Z.ai API key):

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "your-z-ai-api-key-here",
    "ANTHROPIC_API_KEY": "",
    "API_TIMEOUT_MS": "3000000",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "glm-5.2",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "glm-5.2",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "glm-5.2",
    "ANTHROPIC_SMALL_FAST_MODEL": "glm-5.2",
    "CLAUDE_CODE_SUBAGENT_MODEL": "glm-5.2"
  }
}

Understanding the Configuration

The configuration is simple. Claude Code expects to communicate with Anthropic's API. By setting ANTHROPIC_BASE_URL to Z.ai's Anthropic-compatible endpoint, you redirect all API traffic to Z.ai's servers instead. ANTHROPIC_AUTH_TOKEN holds your Z.ai API key, while ANTHROPIC_API_KEY is intentionally left blank. API_TIMEOUT_MS sets the request timeout to 3,000,000 ms (50 minutes). Every model alias, from Opus to Sonnet to Haiku, plus the small-fast and subagent settings, is mapped to glm-5.2, so every request routes to the same model.
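Before launching Claude Code, it can help to sanity-check the file for the two easiest mistakes: leaving the placeholder key in place, or missing one of the model aliases. The Python sketch below does that against the keys shown in the configuration above. The function name `check_settings` and the specific checks are illustrative assumptions, not part of Claude Code or Z.ai tooling.

```python
import json

# The alias keys used in the configuration shown in this article.
ALIAS_KEYS = [
    "ANTHROPIC_DEFAULT_OPUS_MODEL",
    "ANTHROPIC_DEFAULT_SONNET_MODEL",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL",
    "ANTHROPIC_SMALL_FAST_MODEL",
    "CLAUDE_CODE_SUBAGENT_MODEL",
]


def check_settings(text, expected_model="glm-5.2"):
    """Return a list of problems found in a settings.local.json string."""
    env = json.loads(text).get("env", {})
    problems = []
    if not env.get("ANTHROPIC_BASE_URL", "").startswith("https://"):
        problems.append("ANTHROPIC_BASE_URL should be an https URL")
    token = env.get("ANTHROPIC_AUTH_TOKEN", "")
    if not token or token == "your-z-ai-api-key-here":
        problems.append("ANTHROPIC_AUTH_TOKEN is empty or still the placeholder")
    for key in ALIAS_KEYS:
        if env.get(key) != expected_model:
            problems.append(f"{key} is not set to {expected_model}")
    return problems
```

To use it, read your `.claude/settings.local.json` into a string and pass it in; an empty list means none of these checks failed, not that the key itself is valid.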

Per-Project Model Switching

Herk's recommended workflow involves maintaining separate project directories, each with its own .claude/settings.local.json. One directory might be configured for GLM 5.2 (cost-efficient general work), another for Opus 4.8 (heavy reasoning tasks), and another for Sonnet 3.7 (balanced middle ground). To switch models, open the appropriate directory in Claude Code.
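If you maintain several such directories, a small helper can write the settings file for each one. The Python sketch below is one way to do it; the function name `write_profile` and the example directory name are hypothetical, and the `env` values you pass in should be the ones from the configuration step above.

```python
import json
from pathlib import Path


def write_profile(project_dir, env):
    """Write env overrides to <project_dir>/.claude/settings.local.json."""
    path = Path(project_dir) / ".claude" / "settings.local.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"env": env}, indent=2) + "\n", encoding="utf-8")
    return path


# Example call (hypothetical directory name; supply your own env values):
# write_profile("projects/glm-work", {"ANTHROPIC_DEFAULT_OPUS_MODEL": "glm-5.2"})
```

Because the file contains your API key, keep it out of version control.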

The Bigger Picture: What This Means for the Future

Herk's final analysis touches on a trend that extends far beyond any single model comparison. The gap between open-source and closed-source AI is closing at an astonishing pace. What was exclusively the domain of billion-dollar research labs six months ago is now available to download, modify, and deploy on your own infrastructure.

He predicts a future where companies routinely run their own local models rather than depending on external providers. This shift has not gone unnoticed by the incumbents. Anthropic's investments in services and forward-deployed engineering teams, OpenAI's diversification into enterprise consulting - these are strategic responses to a world where the model itself may not be the durable competitive advantage.

"Right now there's a huge gap, but we see this gap closing super quickly and it's really fun to watch in real time," Herk observed. For developers, knowledge workers, and businesses, the implications are profound. Building fluency with open-source models today is an investment in operational resilience tomorrow.

Conclusion

GLM 5.2 inside Claude Code represents something genuinely significant: the moment when an open-source model became practical, economical, and capable enough to serve as a daily driver for serious knowledge work. Herk's extensive testing revealed a model that designs beautiful websites in a quarter of the time of its premium competitor, produces creative one-shot builds that rival the best, conducts multi-agent deep research with confidence, and does it all for about a fifth of the price.

It is not perfect. For tasks requiring the deepest reasoning, the most subtle edge-case detection, or the most complex analytical thinking, Claude Opus 4.8 remains the superior model. But the gap is narrowing, and the economics increasingly favour a hybrid approach - using GLM 5.2 for the 80% of work that doesn't require maximum reasoning power, and reserving Opus for the 20% that does.

The setup process is simple. The cost savings are substantial. The performance is genuinely competitive. And perhaps most importantly, the model is yours - not rented from a company burning through venture capital, but available to download, deploy, and control. In an industry defined by rapid change and corporate volatility, that ownership matters.

If you haven't experimented with routing alternative models through Claude Code's harness, now is the time. The future of AI isn't a single provider with a single model. It's an ecosystem of specialised tools, and GLM 5.2 just proved it belongs in your toolkit.

Helpful Resources

Official Platforms and APIs:

Z.ai - Sign up for GLM 5.2 API access and explore the model in the browser-based chat interface
Z.ai Anthropic-Compatible API - The API endpoint for routing GLM 5.2 into Claude Code
Ollama - Platform for downloading and running open-source models locally (note: GLM 5.2 runs from cloud due to its 756B parameter size)

Claude Code and Configuration:

Claude Code Documentation - Official documentation for Claude Code setup and configuration
Claude Code Settings Guide - Learn about settings.local.json and environment variable configuration
Claude Code Subagent Documentation - How Claude Code uses subagents for complex multi-step tasks

GLM Model Information:

Zhipu AI Official Website - Developers of the GLM model family
GLM-5.2 Technical Documentation - Technical specifications for the GLM-5.2 model
BigCode Benchmarks - Software engineering benchmark comparisons

STORM Research Methodology:

STORM Research Paper - "Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models"

Source trail

Primary references to keep this briefing grounded

AI and automation information changes quickly. Use these official or primary references to verify the claims, pricing, product behaviour, and compliance details before committing budget or production data.

Frequently asked questions

What is the practical takeaway from GLM 5.2 Inside Claude Code?

GLM 5.2 can be routed into Claude Code for roughly a fifth of the price of Opus, and in a full day of testing it held up for most knowledge work. For AI Kick Start readers, the key is to translate the idea into one agent workflow with clear inputs, review points, and measurable outcomes. The article should be treated as implementation guidance, not a substitute for workflow design.

Who should use GLM 5.2 Inside Claude Code guidance in AI Coding?

This guidance is most useful for developers and technical teams who need to decide whether the topic changes tool selection, automation design, search visibility, data handling, training, or operational governance.

How should an Australian business implement GLM 5.2 Inside Claude Code?

Start small: define the agent boundary, give it test data, log its actions, and keep approval gates around customer or financial decisions. If the pilot improves successful task completion and review time, document the pattern, link it to the relevant service or resource page, and then decide whether it belongs in a production workflow.

What to do next

Write down the single agent workflow that routing GLM 5.2 through Claude Code should improve.
Collect real examples, edge cases, and source material before testing any GLM 5.2 output.
Before rolling it out, add a human review checkpoint for quality, privacy, brand, or customer-impact risk.
Measure successful task completion, review time, and fallback rate before deciding whether to scale.
Connect the workflow to a related service, resource, or training path so there is a clear next action.

Want help applying this? Explore AI agent design systems.

AI Kick Start is an Illawarra-based AI studio in Figtree, helping businesses across Wollongong, Shellharbour and Kiama and right across Australia put AI to work.
