Claude Code's Extended Thinking Is a Summary - What That Means for You

A blog post by Patrick McCanna hit the Hacker News front page yesterday with a discovery that surprised some developers: Claude Code's "extended thinking" output isn't the raw chain-of-thought reasoning. It's a summary.

With 140 points and 108 comments, the discussion quickly moved from "scandal" to "wait, this is documented" to "but what does it actually mean for my workflow?"

What McCanna Found

McCanna examined session logs stored on disk over a weekend and found only encrypted signatures (around 600 characters) with no readable text. Digging into Anthropic's documentation, he discovered that:

Claude encrypts its actual reasoning into signatures that Anthropic controls
Users' machines don't receive the decryption key
The API returns a summary of reasoning, not the reasoning itself
Full thinking output requires an enterprise agreement

His analogy: "This is like saving a jpeg as a .bmp and then editing the .bmp and presenting it as a jpeg. The conversion produces data loss."

(A few HN commenters pointed out he got the lossy format backwards - JPEG is the lossy one, BMP is lossless - but the point stands.)

What HN Is Saying

The discussion breaks into several threads:

"This Isn't News"

The most common response: this is documented behavior, not a hidden secret.

This is not just Anthropic. Almost all big AI companies, including OpenAI and Google, hide their model's actual reasoning. This is because revealing the raw reasoning exposes exactly how the AI processes information.

Another commenter noted:

Yes hasn't this been around since Opus 4.6? I very much recall this change happening around January or February, and it was very explicitly to prevent distillation.

The Anti-Distillation Argument

Several commenters explained why AI labs hide raw thinking tokens: competitors can train on exposed chain-of-thought to replicate results.

If you have the full outputs, it might make it easier for competitors to distil the model or reverse engineer the full process.

From another angle:

By hiding the reasoning tokens, it makes it harder to do this. You can still try to distill the models, but you can't distill reasoning itself as well.

There's even an open acknowledgment that this already happens with whatever output is available - someone shared a Hugging Face model fine-tuned on Opus "thinking" tokens.

The Practical Impact

For working developers, the key question is: does this matter for your workflow?

One perspective:

I've never found the entire reasoning chain that particularly useful for my work. For me having a summary is honestly better from a context management perspective.

A critical detail from the docs: the summary doesn't go into the context window. The full CoT does. The summary is for human consumption only.

The summary doesn't go into the context, it's for human consumption. The CoT itself goes into the context.

And crucially: you're billed for the full thinking tokens, not the summary.

The Workaround

If you want the actual chain-of-thought reasoning without encrypted summaries, one commenter pointed out you can go old-school:

If you go back to the old school from 2 years ago and provide explicit CoT prompts, you get the full thinking prompts back again! So you disable thinking altogether, and instead make thinking part of the regular prompt.

Example prompt structure:

Before providing your answer, think step by step. For example:

The user is asking me to...
I need to think about the blah blah. First, I should foo the bar, and then...

Answer: <put your final answer here>

This bypasses the summarization entirely - but you lose whatever optimization Anthropic has built into their thinking mode.

The Philosophy Question

Some comments went deeper on what "authentic thinking" even means for an LLM:

The full thinking logs are also a summary of a thinking process presumably consistent with one necessary to generate the provided answer. Nobody really understands how LLMs think.

And:

AIUI it's fairly well established that the models can be saying one thing and "really" thinking another anyhow.

Research has shown that the text a model produces in its "reasoning" may not accurately reflect the actual activation patterns happening internally. So even "raw" thinking tokens might not be the ground truth.

Get the weekly deep dive

Tutorials on Claude Code, AI agents, and dev tools - delivered free every week.

From the archive

Codex CLI Needs Resource Budgets, Not Just Token Budgets

Jun 22, 2026 • 8 min read

Codex Logging Bug Can Write Terabytes to Your SSD

Jun 22, 2026 • 5 min read

Deno Desktop Lets You Build Native Apps with TypeScript

Jun 22, 2026 • 6 min read

Fugu Ultra's Frontier Performance Claim, Explained Without the Hype

Jun 22, 2026 • 11 min read

What This Means for Your Workflow

If you're debugging agent behavior: The summary should be sufficient for most prompt engineering. You see the key decision points. But if you need to trace exactly why a model made a specific choice, you're working with compressed information.

If you're measuring performance drift: McCanna's original concern was tracking changes over time. With summarized output, you're comparing summaries - which may vary even if the underlying reasoning stays consistent.

If you're worried about costs: You're already paying for the full reasoning. The summary is a presentation convenience, not a cost reduction.

If you care about distillation protection: This matters more for AI labs than individual developers. But it does mean you can't easily capture and reuse Claude's exact reasoning patterns for your own fine-tuning.

The Bigger Picture

This situation reveals a tension in the AI tooling space: developers want transparency for debugging and understanding, but AI labs have commercial reasons to obscure how their models work.

Anthropic's approach is documented and defensible - you get enough to debug, they protect enough to prevent easy replication. Whether that tradeoff works for you depends on your use case.

For most Claude Code users, the summary is fine. For researchers trying to understand model behavior at a deep level, you'll need to look elsewhere - or pay for enterprise access.

What McCanna Found