The 98% Context Reduction Pattern

Most agent systems waste context by default.

They call a tool. The tool returns a large JSON blob. The model reads the blob, chooses the next tool, gets another large blob, and repeats. After a few steps, the context window is full of intermediate data the agent no longer needs.

The result is familiar: slower runs, higher costs, worse reasoning, and failures that look mysterious until you inspect the transcript.

The fix is simple but underused: keep intermediate state outside the model context.

Anthropic's recent engineering work around code execution with MCP points at this pattern. Instead of making the model directly inspect every row, page, or event, give the agent an execution environment where it can write small programs, process data locally, and return only the answer, the evidence, and the receipt.

For the foundation, read progressive disclosure in Claude Code and the context engineering guide. This post is the implementation pattern.

The Bad Loop

A naive agent loop looks like this:

agent -> list all customers
tool -> returns 10,000 rows
agent -> filter for accounts with failed invoices
tool -> returns 1,200 rows
agent -> inspect invoice events
tool -> returns 30,000 events
agent -> summarize failures

The model context becomes a data warehouse. That is not what a language model is good at.

Even if the context window is large enough, the reasoning quality suffers. The model has to search through raw data, remember which parts matter, and avoid being distracted by irrelevant fields.

Large context is useful. It is not a substitute for data processing.

Get the weekly deep dive

Tutorials on Claude Code, AI agents, and dev tools - delivered free every week.

From the archive

Agent Swarms Need Receipts

May 2, 2026 • 8 min read

Agentic Search Works Best When It Writes Queries, Not Answers

May 2, 2026 • 8 min read

Approval Fatigue Is an Agent Security Bug

May 2, 2026 • 7 min read

Claude Code Agent Teams, Subagents, and MCP: The 2026 Playbook

May 2, 2026 • 9 min read

The Better Loop

A better loop gives the agent a place to compute:

agent -> write a script that queries customers, joins invoices, filters failures, and outputs a compact report
execution environment -> runs the script
tool -> returns summary, counts, source IDs, and errors
agent -> reasons from the report

The model does not need every row. It needs the result and enough evidence to trust it.

The difference is not cosmetic. It changes the shape of the whole agent system:

raw data stays in the execution environment
intermediate files stay on disk
logs stay in traces
the model sees compact outputs
humans get receipts they can audit

That is the 98% context reduction pattern.

What Belongs Outside Context

Move these out of the model context whenever possible:

full database query results
full API responses
raw HTML pages
large logs
dependency trees
generated intermediate files
repeated tool schemas
long test output after the first failure

Keep these in context:

the user goal
relevant constraints
a compact summary of findings
source IDs or links
the current plan
the final diff or artifact
the next decision that needs reasoning

The model should reason. Code should crunch.

Filesystem State Is Agent Memory

The most underrated memory primitive is still the filesystem.

If an agent processes 50 files, it does not need to paste the full contents of all 50 into context. It can write:

.agent-work/
  findings.json
  failing-tests.txt
  candidate-files.txt
  summary.md

Then it can read the compact summary when needed. The raw evidence remains available without living in the prompt forever.

This is why local coding agents feel powerful. They can use files as durable scratch space. The context window becomes the active working set, not the entire workspace.

The Receipt Format

A good reduced-context tool response should include:

{
  "summary": "Found 18 failed invoices across 7 customers.",
  "counts": {
    "customersScanned": 10421,
    "failedInvoices": 18,
    "affectedCustomers": 7
  },
  "evidence": [
    "customer_123 invoice inv_456",
    "customer_789 invoice inv_999"
  ],
  "filesWritten": [
    ".agent-work/failed-invoices.csv",
    ".agent-work/invoice-summary.md"
  ],
  "nextSuggestedAction": "Inspect payment provider webhook logs for these invoice IDs."
}

The model can act on that. The human can audit it. The raw data is still available if deeper inspection is needed.

How to Apply This Today

You do not need a new framework.

For Claude Code, ask it to write analysis scripts and keep raw outputs in files. For MCP servers, expose workflow tools that process data server-side and return receipts. For custom agent apps, add a workspace directory and persist intermediate state between steps.

The architecture is boring:

Let the agent create a script or query.
Run it in a scoped environment.
Save raw outputs to files.
Return a compact summary.
Keep links back to evidence.

That boring pattern is what makes long-running agents cheaper and more reliable.

The Bottom Line

Context is not a trash can.

Efficient agents keep the model focused on decisions and keep intermediate state in the systems built for it: code, files, databases, logs, and traces. The best agent architectures do not ask the model to remember everything. They give it a reliable way to retrieve what matters.

That is how you cut context without cutting capability.

Sources

Anthropic Engineering: Effective context engineering for AI agents
Anthropic Engineering: Code execution with MCP
DevDigest: Progressive Disclosure: How Claude Code Cut Token Usage by 98%
DevDigest: Context Engineering Guide

Progressive Disclosure: How Claude Code Cut Token Usage by 98%

Context Engineering: The Highest-Leverage Skill in AI-Assisted Development

AI Agent Memory Patterns

The Bad Loop

Agent Swarms Need Receipts

Agentic Search Works Best When It Writes Queries, Not Answers

Approval Fatigue Is an Agent Security Bug

Claude Code Agent Teams, Subagents, and MCP: The 2026 Playbook

The Better Loop

What Belongs Outside Context

Filesystem State Is Agent Memory

The Receipt Format

How to Apply This Today

The Bottom Line

Sources

Comments

Related Tools

Composio

MCP Inspector

MCP CLI

MCP Hub

Apps from Developers Digest

MCP Directory

Related Guides

Prompt Caching - Claude Code

MCP Tool Search - Claude Code

Claude Code Setup Guide

Related Posts

Progressive Disclosure: How Claude Code Cut Token Usage by 98%

Context Engineering: The Highest-Leverage Skill in AI-Assisted Development

AI Agent Memory Patterns

DD Traces: Beautiful Local OpenTelemetry for AI Development

One Tool Beats Ten Endpoints

Long-Running Agents Need Harnesses, Not Hope

Get Smarter About AI Dev

Progressive Disclosure: How Claude Code Cut Token Usage by 98%

Context Engineering: The Highest-Leverage Skill in AI-Assisted Development

AI Agent Memory Patterns

The Bad Loop

Agent Swarms Need Receipts

Agentic Search Works Best When It Writes Queries, Not Answers

Approval Fatigue Is an Agent Security Bug

Claude Code Agent Teams, Subagents, and MCP: The 2026 Playbook

The Better Loop

What Belongs Outside Context

Filesystem State Is Agent Memory

The Receipt Format

How to Apply This Today

The Bottom Line

Sources

Comments

Related Tools

Composio

MCP Inspector

MCP CLI

MCP Hub

Apps from Developers Digest

MCP Directory

Related Guides

Prompt Caching - Claude Code

MCP Tool Search - Claude Code

Claude Code Setup Guide

Related Posts

Progressive Disclosure: How Claude Code Cut Token Usage by 98%

Context Engineering: The Highest-Leverage Skill in AI-Assisted Development

AI Agent Memory Patterns

DD Traces: Beautiful Local OpenTelemetry for AI Development

One Tool Beats Ten Endpoints

Long-Running Agents Need Harnesses, Not Hope

Get Smarter About AI Dev