GLM-5.2 Developer Guide: Z.ai's 1M-Context Coding Model

Official Sources

Resource	Link
Z.AI Developer Documentation	docs.z.ai/devpack/overview
Claude Code Setup	docs.z.ai/devpack/tool/claude
GLM-5 Overview	z.ai/blog/glm-5
OpenRouter GLM-5	openrouter.ai/z-ai/glm-5
Z.AI Twitter	@Zai_org

Z.ai released GLM-5.2 on June 13, 2026, making it available immediately to every GLM Coding Plan subscriber. This is the company's new flagship coding model - and the headline feature is a 1,000,000-token context window that actually works for large codebase navigation.

The model ships with two thinking-effort levels (High and Max), 131,072 output tokens per response, and MIT-licensed open weights arriving within the week. No benchmarks have been published yet - Z.ai shipped first, benchmarks later. Here is what developers need to know to start testing it.

Last updated: June 15, 2026

What GLM-5.2 brings to the table

GLM-5.2 is a step function jump from GLM-5.1 in context capacity. The usable context window expands from 200,000 tokens to 1,000,000 tokens - roughly five times larger. For coding work, this means you can load entire monorepo directories without hitting the context ceiling that forces aggressive summarization.

The output limit also increased to 131,072 tokens per response, which matters for long refactors, multi-file diffs, and migration scripts that need to output complete files.

Z.ai added two thinking-effort levels:

High effort: Faster responses, good for routine coding tasks and quick iterations
Max effort: Deeper reasoning passes before returning an answer, recommended for complex refactors and multi-step agentic work

For coding tasks specifically, Z.ai recommends Max effort. The extra thinking time pays off on tasks that benefit from planning and verification passes - the same pattern Anthropic documented with Fable 5's effort levels.

GLM Coding Plan pricing

Z.ai offers the GLM Coding Plan in three tiers, billed quarterly:

Plan	Price	Quarterly	Notes
Lite	~$10/month	$30/quarter	Entry point for solo developers
Pro	~$30/month	$90/quarter	Higher quotas, recommended for active use
Max	~$80/month	$240/quarter	Highest limits, team-friendly

Q2 2026 discounts bring these down slightly ($27, $81, $216 per quarter). Earlier promotional pricing around $3/month no longer exists - Z.ai removed first-purchase discounts in February 2026.

The Coding Plan exposes an Anthropic-compatible endpoint. If you have built agents or workflows against Claude's API, they work with a base-URL and API key swap - no code changes required beyond environment variables.

Setup with Claude Code

The GLM Coding Plan maps Claude Code's model tiers to GLM models by default:

Opus and Sonnet requests route to GLM-4.7
Haiku requests route to GLM-4.5-Air

To use GLM-5.2 instead of the defaults, you need to override the model environment variables.

Environment variable setup

Add these to your shell config (.bashrc, .zshrc, or equivalent):

export ANTHROPIC_BASE_URL="https://open.z.ai/api/paas/v4/"
export ANTHROPIC_API_KEY="your-glm-coding-plan-key"
export ANTHROPIC_DEFAULT_SONNET_MODEL="glm-5.2[1m]"
export ANTHROPIC_DEFAULT_OPUS_MODEL="glm-5.2[1m]"
export CLAUDE_CODE_AUTO_COMPACT_WINDOW=1000000

The [1m] suffix enables the 1M context variant. Without it, you get the standard context window. The CLAUDE_CODE_AUTO_COMPACT_WINDOW setting tells Claude Code to use the full 1M context before triggering automatic compaction.

Settings file setup

Alternatively, add or update these values in ~/.claude/settings.json:

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://open.z.ai/api/paas/v4/",
    "ANTHROPIC_API_KEY": "your-glm-coding-plan-key",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "glm-4.5-air",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "glm-5.2[1m]",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "glm-5.2[1m]",
    "CLAUDE_CODE_AUTO_COMPACT_WINDOW": "1000000"
  }
}

If Claude Code reports that the model with the [1m] suffix does not exist, upgrade to the latest Claude Code version and try again.

Verification

After setup, start a new Claude Code session and ask it to identify which model it is running. The response should confirm GLM-5.2. You can also check by running a task that would exceed the standard 200K context - if it works without compaction warnings, the 1M context is active.

Get the weekly deep dive

Tutorials on Claude Code, AI agents, and dev tools - delivered free every week.

From the archive

OpenRouter Fusion Makes Model Panels Real. Use Them Like Escalation, Not Autopilot

Jun 15, 2026 • 8 min read

Kimi K2.7-Code Developer Guide: The Open-Source Coding Model Worth Running

Jun 14, 2026 • 8 min read

Agent Workspaces Need Filesystem Contracts

Jun 13, 2026 • 8 min read

Best Claude Model Now That Fable 5 Is Disabled (Mythos vs Opus vs GPT-5.5)

Jun 13, 2026 • 6 min read

What to test before benchmarks arrive

Z.ai shipped GLM-5.2 without published benchmarks. That is not necessarily a red flag - it matches Z.ai's historical pattern of releasing models quickly and letting the community validate them. But it does mean you should test against your actual workloads before committing.

Tests worth running

Large codebase navigation. Load a 500K+ token directory into context and ask the model to explain architecture, find specific patterns, or trace a call path. This is where the 1M context should shine - or fail visibly if the attention degrades at scale.

Multi-file refactors. Ask for a refactor that touches 10+ files. Check whether the output maintains consistency across files and respects your existing patterns. GLM-5.1 was competitive with Sonnet 4.5 on this; GLM-5.2 should be stronger.

Long-horizon agentic tasks. Run a multi-step task with tool calls - file reads, writes, searches. Track whether the model stays on task across steps or drifts. Use Max effort for this.

Comparison with your current model. Run the same task through GLM-5.2 and your current default (Fable 5, Opus 4.8, GPT-5.5, whatever you use). Note completion quality, speed, and whether the context window matters for that task.

Known considerations

The Anthropic-compatible endpoint means Claude Code's MCP servers, skills, and hooks all work without modification
Latency may differ from Anthropic's infrastructure - test response times for your typical prompts
No tool-use benchmarks have been published yet, so agentic reliability is an open question

How GLM-5.2 compares

Direct GLM-5.2 vs Fable 5 vs GPT-5.5 benchmarks do not exist yet. What we know from GLM-5.1 benchmarks as a baseline:

Model	SWE-bench Pro	Code Arena Elo
Claude Fable 5	80.3%	Not rated
GPT-5.5	58.6%	Not rated
GLM-5.1	58.4%	1530

GLM-5.1 was competitive with GPT-5.5 on SWE-bench Pro. GLM-5.2 should improve on that - the question is by how much.

The context window advantage is real and measurable. Fable 5 offers a 1M+ context window as well, but at $10/$50 per million tokens on the API. GLM Coding Plan pricing is substantially lower for comparable context capacity.

When to use GLM-5.2

Good fit:

Cost-sensitive teams that need large context for monorepo work
Claude Code users who want to keep their existing workflow but reduce API costs
Developers testing open-weight models before the MIT weights release
Projects where the Anthropic-compatible API matters for existing integrations

Wait or skip:

If you need verified benchmark performance before switching
If you depend on tool-use reliability that has not been publicly validated yet
If you are already on Fable 5 through a subscription plan and the cost is not an issue
If you need the model available outside the Coding Plan (standalone API is still rolling out)

What comes next

Z.ai stated that the MIT-licensed open weights release will follow within a week of the Coding Plan launch. That means self-hosting and local inference options are imminent. For developers who want to run GLM-5.2 on their own infrastructure, the timeline is short.

The standalone API is also rolling out - currently the model is only accessible through the Coding Plan's Anthropic-compatible endpoint. Once the API launches, OpenRouter and other aggregators will likely add it to their routing options.

No benchmark release timeline has been announced. Z.ai's pattern is to let community testing generate the numbers rather than publishing self-reported scores.

FAQ

What is GLM-5.2?

GLM-5.2 is Z.ai's newest flagship coding model, released June 13, 2026. It features a 1,000,000-token context window, 131,072 output tokens per response, two thinking-effort levels (High and Max), and MIT-licensed open weights arriving soon.

How much does GLM-5.2 cost?

GLM-5.2 is available through the GLM Coding Plan. Pricing is approximately $10/month (Lite), $30/month (Pro), or $80/month (Max), billed quarterly. The standalone API with per-token pricing is still rolling out.

Can I use GLM-5.2 with Claude Code?

Yes. The GLM Coding Plan exposes an Anthropic-compatible endpoint. Set the ANTHROPIC_BASE_URL to https://open.z.ai/api/paas/v4/, configure your API key, and override the model environment variables to use glm-5.2[1m]. See the setup section above for full details.

How does GLM-5.2 compare to Claude Fable 5?

No direct benchmarks exist yet. GLM-5.1 scored 58.4% on SWE-bench Pro compared to Fable 5's 80.3%. GLM-5.2 should improve on 5.1 but the magnitude is unknown. The main advantage is cost: GLM Coding Plan pricing is substantially lower than Fable 5 API rates at $10/$50 per million tokens.

When will GLM-5.2 open weights be available?

Z.ai stated the MIT-licensed open weights will release within a week of the June 13 Coding Plan launch. That puts the expected release around June 20, 2026.

Does GLM-5.2 work with MCP servers and Claude Code skills?

Yes. Because the GLM Coding Plan uses an Anthropic-compatible endpoint, your existing MCP server configurations, skills, and hooks work without modification. You only need to change the base URL and API key.

What is the difference between High and Max effort levels?

High effort returns faster responses suitable for routine coding tasks. Max effort runs deeper reasoning passes before returning an answer, recommended for complex refactors and multi-step agentic work. Z.ai recommends Max for coding tasks where accuracy matters more than speed.

Is GLM-5.2 available outside the Coding Plan?

Not yet. The standalone API is still rolling out. Currently, access requires a GLM Coding Plan subscription at the Lite, Pro, or Max tier.

Sources

Z.AI Developer Documentation - accessed June 15, 2026
Z.AI Claude Code Setup Guide - accessed June 15, 2026
GLM-5: From Vibe Coding to Agentic Engineering - Z.ai official blog
Z.ai Launches GLM-5.2 With a Usable 1M-Token Context - MarkTechPost, June 14, 2026
GLM-5.2 Complete Guide - AIMadeTools
GLM Coding Plan Pricing Guide - CodingPlan
OpenRouter GLM-5 - OpenRouter

Official Sources

What GLM-5.2 brings to the table

GLM Coding Plan pricing