
TL;DR
A developer discovered that Claude Code's thinking output is summarized, not the raw reasoning. Here's what Anthropic's docs actually say - and why it matters.
A blog post by Patrick McCanna hit the Hacker News front page yesterday with a discovery that surprised some developers: Claude Code's "extended thinking" output isn't the raw chain-of-thought reasoning. It's a summary.
With 140 points and 108 comments, the discussion quickly moved from "scandal" to "wait, this is documented" to "but what does it actually mean for my workflow?"
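McCanna examined session logs stored on disk over a weekend and found only encrypted signatures (around 600 characters) with no readable text. Digging into Anthropic's documentation, he discovered that the thinking text Claude Code displays is a summary of the model's reasoning, not the raw chain-of-thought.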
His analogy: "This is like saving a jpeg as a .bmp and then editing the .bmp and presenting it as a jpeg. The conversion produces data loss."
(A few HN commenters pointed out he got the lossy format backwards - JPEG is the lossy one, BMP is lossless - but the point stands.)
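The kind of log inspection McCanna describes can be sketched in a few lines of Python. This is a rough sketch, not his actual method: it assumes the session log is JSONL, that each record carries a `message.content` list, and that thinking blocks have `type`, `thinking`, and `signature` fields. That layout is an assumption for illustration and may not match what is on your disk.

```python
import json

def summarize_thinking_blocks(jsonl_lines):
    """Return (readable_chars, signature_chars) for each thinking block found."""
    results = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict):
            continue
        # Assumed layout: record["message"]["content"] is a list of blocks.
        message = record.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue
        for block in content:
            if isinstance(block, dict) and block.get("type") == "thinking":
                text = block.get("thinking") or ""
                signature = block.get("signature") or ""
                results.append((len(text), len(signature)))
    return results

# Hypothetical sample record: no readable text, a 600-character signature.
sample = [
    json.dumps({"message": {"content": [
        {"type": "thinking", "thinking": "", "signature": "x" * 600},
        {"type": "text", "text": "Done."},
    ]}}),
    "not json",
]
print(summarize_thinking_blocks(sample))
```

A block that reports zero readable characters next to a long signature is the pattern the post describes.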
The discussion breaks into several threads:
The most common response: this is documented behavior, not a hidden secret.
"This is not just Anthropic. Almost all big AI companies, including OpenAI and Google, hide their model's actual reasoning. This is because revealing the raw reasoning exposes exactly how the AI processes information."
Another commenter noted:
"Yes hasn't this been around since Opus 4.6? I very much recall this change happening around January or February, and it was very explicitly to prevent distillation."
Several commenters explained why AI labs hide raw thinking tokens: competitors can train on exposed chain-of-thought to replicate results.
"If you have the full outputs, it might make it easier for competitors to distil the model or reverse engineer the full process."
From another angle:
"By hiding the reasoning tokens, it makes it harder to do this. You can still try to distill the models, but you can't distill reasoning itself as well."
There's even an open acknowledgment that this already happens with whatever output is available - someone shared a Hugging Face model fine-tuned on Opus "thinking" tokens.
For working developers, the key question is: does this matter for your workflow?
One perspective:
"I've never found the entire reasoning chain that particularly useful for my work. For me having a summary is honestly better from a context management perspective."
A critical detail from the docs: the summary doesn't go into the context window. The full CoT does. The summary is for human consumption only.
"The summary doesn't go into the context, it's for human consumption. The CoT itself goes into the context."
And crucially: you're billed for the full thinking tokens, not the summary.
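To make the billing point concrete, here is a back-of-the-envelope sketch. The token counts and the per-million-token price are made-up placeholders, not Anthropic's actual rates, and the function assumes thinking tokens are charged at the output-token rate.

```python
def thinking_cost(full_thinking_tokens, summary_tokens, price_per_million_output):
    """Compare what you are billed for with what the visible summary would cost."""
    billed = full_thinking_tokens * price_per_million_output / 1_000_000
    visible_only = summary_tokens * price_per_million_output / 1_000_000
    return billed, visible_only

# Hypothetical numbers: 8,000 thinking tokens, a 400-token summary,
# and a placeholder price of $15 per million output tokens.
billed, visible = thinking_cost(8_000, 400, 15.0)
print(f"billed for thinking: ${billed:.3f}, summary alone would be: ${visible:.3f}")
```

Under these placeholder numbers the bill reflects the 8,000 tokens the model produced, not the 400 you get to read.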
If you want the actual chain-of-thought text rather than a summary, one commenter pointed out you can go old-school:
"If you go back to the old school from 2 years ago and provide explicit CoT prompts, you get the full thinking prompts back again! So you disable thinking altogether, and instead make thinking part of the regular prompt."
Example prompt structure:
Before providing your answer, think step by step. For example:
The user is asking me to...
I need to think about the blah blah. First, I should foo the bar, and then...
Answer: <put your final answer here>
This bypasses the summarization entirely - but you lose whatever optimization Anthropic has built into their thinking mode.
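If you try the prompt-based approach, you will probably want to build the prompt programmatically and split the reply into reasoning and answer. The sketch below does only that string handling; it makes no API calls. The helper names and the `Question:` suffix are hypothetical, and it assumes the model actually follows the `Answer:` convention from the prompt above.

```python
COT_TEMPLATE = (
    "Before providing your answer, think step by step. For example:\n"
    "The user is asking me to...\n"
    "I need to think about ... First, I should ..., and then...\n"
    "Answer: <put your final answer here>\n\n"
    "Question: {question}"
)

def build_cot_prompt(question):
    """Wrap a question in the explicit chain-of-thought instructions."""
    return COT_TEMPLATE.format(question=question)

def split_reasoning_and_answer(response_text):
    """Split a reply at the last 'Answer:' marker; answer is None if it is missing."""
    marker = "Answer:"
    idx = response_text.rfind(marker)
    if idx == -1:
        return response_text.strip(), None
    reasoning = response_text[:idx].strip()
    answer = response_text[idx + len(marker):].strip()
    return reasoning, answer

# Hypothetical reply text, standing in for a real model response.
reply = "The user is asking me to add two numbers.\nFirst, I add 2 and 2.\nAnswer: 4"
reasoning, answer = split_reasoning_and_answer(reply)
print(answer)
```

Splitting on the last marker rather than the first is a deliberate choice here: the reasoning itself may mention "Answer:" before the final line.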
Some comments went deeper on what "authentic thinking" even means for an LLM:
"The full thinking logs are also a summary of a thinking process presumably consistent with one necessary to generate the provided answer. Nobody really understands how LLMs think."
And:
"AIUI it's fairly well established that the models can be saying one thing and 'really' thinking another anyhow."
Research has shown that the text a model produces in its "reasoning" may not accurately reflect the actual activation patterns happening internally. So even "raw" thinking tokens might not be the ground truth.
If you're debugging agent behavior: The summary should be sufficient for most prompt engineering. You see the key decision points. But if you need to trace exactly why a model made a specific choice, you're working with compressed information.
If you're measuring performance drift: McCanna's original concern was tracking changes over time. With summarized output, you're comparing summaries - which may vary even if the underlying reasoning stays consistent.
If you're worried about costs: You're already paying for the full reasoning. The summary is a presentation convenience, not a cost reduction.
If you care about distillation protection: This matters more for AI labs than individual developers. But it does mean you can't easily capture and reuse Claude's exact reasoning patterns for your own fine-tuning.
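The drift caveat above is easy to see with a toy comparison. This sketch scores two summaries by word-level overlap using the standard library's `difflib`; the two summary strings are invented for illustration, and a word-overlap ratio is only one crude way to compare text.

```python
from difflib import SequenceMatcher

def summary_similarity(a, b):
    """Word-level similarity between two summaries, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()

# Two invented summaries that describe the same plan in different words.
run_1 = "Check the config file, then run the tests"
run_2 = "Run the tests after checking the config file"
print(summary_similarity(run_1, run_2))
```

A score below 1.0 here says the wording differs, not that the underlying reasoning changed, which is exactly why comparing summaries is a noisy way to track drift.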
This situation reveals a tension in the AI tooling space: developers want transparency for debugging and understanding, but AI labs have commercial reasons to obscure how their models work.
Anthropic's approach is documented and defensible - you get enough to debug, they protect enough to prevent easy replication. Whether that tradeoff works for you depends on your use case.
For most Claude Code users, the summary is fine. For researchers trying to understand model behavior at a deep level, you'll need to look elsewhere - or pay for enterprise access.