TL;DR
Anthropic says persistent file-based memory improved Fable 5 three times more than it improved Opus 4.8. Here is the full memory tool setup - handlers, security, and context editing included.
Last updated: June 11, 2026
Buried in the Claude Fable 5 launch post is one of the most actionable engineering claims of the release. In Anthropic's Slay the Spire testing, "giving it access to persistent file-based memory improved its performance three times more than for Opus 4.8," and with memory enabled, Fable reached the game's final act three times more often. The launch post also states that Fable 5 "stays focused across millions of tokens in long-running tasks and improves its outputs using its own notes."
Read that as a deployment instruction, not a benchmark flex. If the model was trained to exploit a memory surface this aggressively, running Fable 5 without one leaves a disproportionate amount of capability on the table. The good news: the memory tool is GA, listed as a supported launch feature for Fable 5, and takes about an hour to wire up. This tutorial walks through the whole thing.
The usual caveat: a deck-building game is a vendor benchmark, not your CI pipeline. But the setup cost is low enough that testing it on your own workload is the rational move.
The memory tool is a client-side tool: Anthropic defines the schema and trains Claude on it, but your application executes every operation. Claude reads and writes files under a virtual /memories directory, and you decide what backs it - local disk, a database, S3, encrypted storage. Per the tool reference, the tool type is memory_20250818, it is generally available, and no beta header is required for the tool itself.
Your handler needs to support six commands:
| Command | What it does |
|---|---|
| view | List a directory (2 levels deep) or show file contents with line numbers |
| create | Create a new file from file_text |
| str_replace | Replace a unique string in a file |
| insert | Insert text at a specific line |
| delete | Delete a file or directory (recursive) |
| rename | Move or rename a file or directory |
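To make the six-command surface concrete, here is a minimal sketch of a dispatcher over an in-memory Map. The field names (path, file_text, old_str, new_str, insert_line, insert_text, old_path, new_path) follow a reading of the tool reference and should be checked against it. The store, isUnder, dispatch, and the short return strings are illustrative placeholders, not the documented response strings.

```typescript
// Assumed command shapes; verify field names against the tool reference.
type MemoryCommand =
  | { command: "view"; path: string }
  | { command: "create"; path: string; file_text: string }
  | { command: "str_replace"; path: string; old_str: string; new_str: string }
  | { command: "insert"; path: string; insert_line: number; insert_text: string }
  | { command: "delete"; path: string }
  | { command: "rename"; old_path: string; new_path: string };

// Hypothetical backend: virtual path -> file contents.
const store = new Map<string, string>();

// True when key is the path itself or an entry underneath it.
function isUnder(key: string, path: string): boolean {
  return key === path || key.startsWith(path + "/");
}

// Placeholder return strings; a real handler returns the documented ones.
function dispatch(cmd: MemoryCommand): string {
  switch (cmd.command) {
    case "view": {
      const dir = cmd.path;
      const text = store.get(dir);
      if (text !== undefined) return text;
      // Simplified listing: every nested path, not the 2-level view.
      return Array.from(store.keys())
        .filter((key) => isUnder(key, dir))
        .sort()
        .join("\n");
    }
    case "create":
      store.set(cmd.path, cmd.file_text);
      return `created ${cmd.path}`;
    case "str_replace": {
      const text = store.get(cmd.path);
      if (text === undefined) return `no such file: ${cmd.path}`;
      const replacement = cmd.new_str;
      // Uniqueness checking is deliberately left out of this sketch.
      store.set(cmd.path, text.replace(cmd.old_str, () => replacement));
      return `edited ${cmd.path}`;
    }
    case "insert": {
      const text = store.get(cmd.path);
      if (text === undefined) return `no such file: ${cmd.path}`;
      const lines = text.split("\n");
      // insert_line 0 means the top of the file; N means after line N.
      lines.splice(cmd.insert_line, 0, cmd.insert_text);
      store.set(cmd.path, lines.join("\n"));
      return `edited ${cmd.path}`;
    }
    case "delete": {
      for (const key of Array.from(store.keys())) {
        if (isUnder(key, cmd.path)) store.delete(key);
      }
      return `deleted ${cmd.path}`;
    }
    case "rename": {
      for (const key of Array.from(store.keys())) {
        if (!isUnder(key, cmd.old_path)) continue;
        const text = store.get(key) ?? "";
        store.delete(key);
        store.set(cmd.new_path + key.slice(cmd.old_path.length), text);
      }
      return `renamed ${cmd.old_path} to ${cmd.new_path}`;
    }
  }
}
```

Modeling the commands as a discriminated union lets the compiler flag any command the switch forgets to handle, which is useful when the handler grows a real backend.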
When the tool is enabled, the API automatically injects a memory protocol into the system prompt. The documented instruction opens with "ALWAYS VIEW YOUR MEMORY DIRECTORY BEFORE DOING ANYTHING ELSE" and tells the model to "ASSUME INTERRUPTION: Your context window might be reset at any moment." The behavior is baked in - you do not prompt for it.
The minimal request adds one entry to the tools array. Two Fable 5 specifics worth knowing before you copy an older snippet: thinking is always on, so omit the thinking parameter entirely (the Fable 5 introduction docs list an explicit disabled as unsupported, and sending one gets rejected with a 400), and depth is controlled with output_config.effort instead.
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic();

const message = await anthropic.messages.create({
  model: "claude-fable-5",
  max_tokens: 16000,
  messages: [
    {
      role: "user",
      content: "Pick up the migration where you left off.",
    },
  ],
  tools: [{ type: "memory_20250818", name: "memory" }],
});
Send that and the first block back will almost always be a tool_use asking to view /memories - which brings us to the part you have to build.
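If you drive the loop by hand rather than through a helper, each memory tool_use block needs a matching tool_result in the next user message. The sketch below shows only that pairing step, using the standard Messages API block shapes. answerToolCalls and the runMemoryCommand callback are hypothetical names, and real responses can carry more block types than the two modeled here.

```typescript
// Minimal block shapes for this sketch; real responses include other types.
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: unknown };
type TextBlock = { type: "text"; text: string };
type ContentBlock = ToolUseBlock | TextBlock;

type ToolResultBlock = { type: "tool_result"; tool_use_id: string; content: string };
type ToolResultMessage = { role: "user"; content: ToolResultBlock[] };

// Build the follow-up user message for every memory call in a response.
// runMemoryCommand is a hypothetical callback into your own handler.
function answerToolCalls(
  content: ContentBlock[],
  runMemoryCommand: (input: unknown) => string,
): ToolResultMessage | null {
  const results: ToolResultBlock[] = [];
  for (const block of content) {
    if (block.type === "tool_use" && block.name === "memory") {
      results.push({
        type: "tool_result",
        tool_use_id: block.id, // must echo the id of the call it answers
        content: runMemoryCommand(block.input),
      });
    }
  }
  // null means the model made no memory calls and the turn is finished.
  return results.length > 0 ? { role: "user", content: results } : null;
}
```

In a full loop you would append the assistant message to the history first, then this tool_result message, and call the API again until no tool_use blocks come back.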
You can hand-roll the six commands against the documented return strings, but the SDKs ship helpers that handle the tool interface for you. In TypeScript, betaMemoryTool wraps a handlers object and plugs straight into the tool runner:
import Anthropic from "@anthropic-ai/sdk";
import {
  betaMemoryTool,
  type MemoryToolHandlers,
} from "@anthropic-ai/sdk/helpers/beta/memory";
import fs from "node:fs/promises";
import path from "node:path";
const ROOT = path.resolve("./memory-store");

// Path guard: every command path must be /memories itself or live under it
function resolveSafe(p: string): string {
  // Require an exact match or a "/memories/" prefix so "/memoriesX" is rejected
  if (p !== "/memories" && !p.startsWith("/memories/")) {
    throw new Error(`Path must start with /memories: ${p}`);
  }
  // Map the virtual path onto the real root, then confirm it did not escape
  const full = path.resolve(ROOT, "." + p.slice("/memories".length));
  if (full !== ROOT && !full.startsWith(ROOT + path.sep)) {
    throw new Error("Path traversal blocked");
  }
  return full;
}
const handlers: MemoryToolHandlers = {
  // Files only in this excerpt: a directory path makes readFile throw,
  // so a full handler also needs the documented directory listing
  async view({ path: p }) {
    const target = resolveSafe(p);
    const text = await fs.readFile(target, "utf8");
    const numbered = text
      .split("\n")
      .map((line, i) => `${String(i + 1).padStart(6)}\t${line}`)
      .join("\n");
    return `Here's the content of ${p} with line numbers:\n${numbered}`;
  },
  async create({ path: p, file_text }) {
    const target = resolveSafe(p);
    await fs.mkdir(path.dirname(target), { recursive: true });
    // "wx" fails if the file already exists instead of overwriting it
    await fs.writeFile(target, file_text, { flag: "wx" });
    return `File created successfully at: ${p}`;
  },
  // str_replace, insert, delete, rename follow the same shape -
  // return the exact success/error strings from the memory tool docs
  async str_replace(command) { /* ... */ },
  async insert(command) { /* ... */ },
  async delete(command) { /* ... */ },
  async rename(command) { /* ... */ },
};
const client = new Anthropic();

const runner = client.beta.messages.toolRunner({
  model: "claude-fable-5",
  max_tokens: 16000,
  tools: [betaMemoryTool(handlers)],
  messages: [{ role: "user", content: "Continue the migration project." }],
});

for await (const message of runner) {
  console.log(message);
}
The docs specify exact return strings for every command and error case (for example, str_replace must reject non-unique matches and report the line numbers of each occurrence). Claude is trained against those shapes, so match them rather than improvising. Python users get an even shorter path: the SDK ships a ready-made BetaLocalFilesystemMemoryTool in anthropic.tools, shown end-to-end in the official memory example, and a BetaAbstractMemoryTool base class when you want a custom backend.
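The uniqueness rule for str_replace is the easiest one to get subtly wrong, so here is a minimal sketch of just that logic on a plain string. replaceUnique and its result type are hypothetical names. The function returns structured data rather than the documented message strings, which you would format from these fields.

```typescript
type ReplaceResult =
  | { ok: true; text: string }
  | { ok: false; reason: "not_found" }
  | { ok: false; reason: "ambiguous"; lines: number[] };

// Replace oldStr only when it occurs exactly once in text.
function replaceUnique(text: string, oldStr: string, newStr: string): ReplaceResult {
  // An empty search string would match everywhere; treat it as not found.
  if (oldStr.length === 0) return { ok: false, reason: "not_found" };

  // Collect the start offset of every non-overlapping occurrence.
  const starts: number[] = [];
  let from = 0;
  while (true) {
    const at = text.indexOf(oldStr, from);
    if (at === -1) break;
    starts.push(at);
    from = at + oldStr.length;
  }

  if (starts.length === 0) return { ok: false, reason: "not_found" };
  if (starts.length > 1) {
    // 1-based line number of each occurrence, for the error report.
    const lines = starts.map((at) => text.slice(0, at).split("\n").length);
    return { ok: false, reason: "ambiguous", lines };
  }

  const at = text.indexOf(oldStr);
  return { ok: true, text: text.slice(0, at) + newStr + text.slice(at + oldStr.length) };
}
```

Splicing by offset instead of calling String.prototype.replace avoids the special handling of `$` sequences in replacement strings, which matters when the model writes notes that contain them.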
The path guard is not optional. The docs carry an explicit warning that your implementation must validate every path against traversal attacks - reject anything that does not resolve inside your memory root, including encoded sequences like %2e%2e%2f. Anything that can influence the model's input can influence those paths.
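One way to cover the encoded case is to percent-decode once before validating, so %2e%2e%2f is checked as the ../ it stands for. The sketch below assumes a local-disk backend. MEMORY_ROOT and resolveMemoryPath are hypothetical names, and the guard does not address symlinks inside the root, which need a separate check.

```typescript
import path from "node:path";

// Hypothetical on-disk location backing the virtual /memories directory.
const MEMORY_ROOT = path.resolve("./memory-store");

// Map a virtual /memories path to a real path, or throw if it escapes the root.
function resolveMemoryPath(virtualPath: string): string {
  let decoded: string;
  try {
    // Decode once so encoded traversal is validated in its real form.
    decoded = decodeURIComponent(virtualPath);
  } catch {
    // Stray "%" characters are rejected outright in this sketch.
    throw new Error("Malformed percent-encoding in path");
  }
  if (decoded.includes("\0")) {
    throw new Error("Null byte in path");
  }
  if (decoded !== "/memories" && !decoded.startsWith("/memories/")) {
    throw new Error(`Path must start with /memories: ${virtualPath}`);
  }
  // Resolve against the root, then confirm the result is still inside it.
  const full = path.resolve(MEMORY_ROOT, "." + decoded.slice("/memories".length));
  if (full !== MEMORY_ROOT && !full.startsWith(MEMORY_ROOT + path.sep)) {
    throw new Error("Path traversal blocked");
  }
  return full;
}
```

The tradeoff is that a legitimate file name containing a bare % is refused. That seems an acceptable price for a store the model controls, but it is a design choice rather than something the docs prescribe.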
Memory solves cross-session persistence. It does nothing about a single session filling its context window with stale tool results. That is context editing's job, and the two are designed to work together: context editing clears old tool results server-side, and when context approaches the clearing threshold, Claude receives an automatic warning to preserve important information - so anything worth keeping gets written to memory files before the results are cleared.
Context editing is beta (header context-management-2025-06-27). The one configuration detail that matters for this pairing: exclude the memory tool from clearing, so memory reads stay in context while bulky one-shot results get dropped.
const response = await client.beta.messages.create({
  model: "claude-fable-5",
  max_tokens: 16000,
  betas: ["context-management-2025-06-27"],
  tools: [{ type: "memory_20250818", name: "memory" }],
  context_management: {
    edits: [
      {
        type: "clear_tool_uses_20250919",
        trigger: { type: "input_tokens", value: 30000 },
        keep: { type: "tool_uses", value: 3 },
        clear_at_least: { type: "input_tokens", value: 5000 },
        exclude_tools: ["memory"],
      },
    ],
  },
  messages,
});
One honest tradeoff from the docs: clearing tool results invalidates the cached prompt prefix at the clearing point, so set clear_at_least high enough that the cache rewrite pays for itself. And remember the cost asymmetry on Fable 5 - every memory file the model reads re-enters context as input tokens at $10 per million, double Opus 4.8. Small, curated memory files are directly cheaper. The memory tool docs also note it pairs with server-side compaction, where memory persists the important state across compaction boundaries.
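The cost point is easy to check with arithmetic. This sketch uses the $10 per million input tokens quoted above for Fable 5 and infers $5 for Opus 4.8 from the "double" comparison. The token counts and read counts are made-up illustrations, and prompt caching discounts are ignored.

```typescript
// Input prices in USD per million tokens.
const FABLE_INPUT_USD_PER_MTOK = 10; // figure quoted in this article
const OPUS_INPUT_USD_PER_MTOK = 5; // assumption: inferred from "double Opus 4.8"

// Cost of re-reading a memory file: tokens per read x reads x price.
function memoryReadCostUsd(
  tokensPerRead: number,
  reads: number,
  usdPerMillionTokens: number,
): number {
  return (tokensPerRead * reads * usdPerMillionTokens) / 1_000_000;
}

// Illustrative numbers: a 4,000-token file vs a curated 800-token file,
// each read 500 times over the life of a project.
const sprawlingOnFable = memoryReadCostUsd(4000, 500, FABLE_INPUT_USD_PER_MTOK);
const curatedOnFable = memoryReadCostUsd(800, 500, FABLE_INPUT_USD_PER_MTOK);
const sprawlingOnOpus = memoryReadCostUsd(4000, 500, OPUS_INPUT_USD_PER_MTOK);

console.log({ sprawlingOnFable, curatedOnFable, sprawlingOnOpus });
```

Under those assumptions the sprawling file costs $20 in reads on Fable 5 against $4 for the curated one, so trimming the file saves more than switching models would.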
The memory tool docs include a pattern for multi-session software projects that is worth implementing verbatim, because it is the structured version of what the Slay the Spire harness did:
The docs' key principle: work one feature at a time, and only mark a feature complete after end-to-end verification - otherwise the progress log stops being trustworthy. If Claude's memory folder gets messy over time, the docs suggest adding an instruction to keep it "up-to-date, coherent and organized" and to delete files that are no longer relevant.
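If your harness seeds or validates the progress file, the one-feature-at-a-time rule can be enforced in code instead of left to convention. Everything in this sketch is hypothetical, including the Feature shape, the status names, and the checklist format; the docs describe the principle, not this data model.

```typescript
type Status = "todo" | "in_progress" | "done";
type Feature = { name: string; status: Status };

// Start a feature only when nothing else is in flight.
function startFeature(features: Feature[], name: string): Feature[] {
  if (features.some((f) => f.status === "in_progress")) {
    throw new Error("Finish the current feature before starting another");
  }
  return features.map((f): Feature =>
    f.name === name && f.status === "todo" ? { ...f, status: "in_progress" } : f,
  );
}

// Mark a feature done only after an end-to-end check has passed.
function completeFeature(
  features: Feature[],
  name: string,
  verifiedEndToEnd: boolean,
): Feature[] {
  if (!verifiedEndToEnd) {
    throw new Error(`Refusing to mark "${name}" done without end-to-end verification`);
  }
  return features.map((f): Feature =>
    f.name === name && f.status === "in_progress" ? { ...f, status: "done" } : f,
  );
}

// Render the checklist as text suitable for a memory file.
function renderProgressLog(features: Feature[]): string {
  return features
    .map((f) => `- [${f.status === "done" ? "x" : " "}] ${f.name} (${f.status})`)
    .join("\n");
}
```

Gating the done transition on a verification flag keeps the log trustworthy: a later session can rely on every checked box having passed a real test.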
A files-in-a-directory model is deliberately primitive, and that is its strength - it composes with everything else you might be running. If you are weighing it against vector stores, summarization layers, or graph memory, our agent memory patterns breakdown maps the option space. Cloudflare reached a strikingly similar conclusion with its SQLite-per-agent memory primitive, and the filesystem-as-agent-state idea gets its fullest treatment in AgentFS. The memory tool is the lowest-friction entry point of the bunch because the model is already trained on the protocol.
Whether Fable 5 is the right model for your long-running agent is a separate question - the Fable 5 vs Opus 4.8 decision guide covers the routing math, and the Fable 5 migration guide lists what breaks coming from Opus. But memory plus Fable 5 is the combination Anthropic measured.
The memory tool itself does not require beta access. The tool type memory_20250818 is listed as GA in Anthropic's tool reference, and the basic request needs no beta header. You only add context-management-2025-06-27 if you pair it with context editing. Note that the SDK toolRunner helpers still live in the beta namespaces.
Memory files live on your infrastructure. The memory tool is client-side: Claude emits commands against a virtual /memories path, and your handler maps them to whatever backend you choose - local filesystem, database, or object storage. Anthropic never hosts the files, which means you own retention, encryption, and per-user isolation.
The memory tool is not exclusive to Fable 5. The official docs' own examples use claude-opus-4-8, and the tool predates Fable 5. What changed with Fable 5 is the payoff: Anthropic's launch post claims persistent file-based memory improved Fable 5's Slay the Spire performance three times more than it improved Opus 4.8.
The memory tool, context editing, and compaction operate at different layers. The memory tool persists files across sessions and runs client-side. Context editing clears stale tool results server-side within a session when context grows past a threshold. Compaction summarizes the whole conversation server-side as it approaches the context window limit. Anthropic's docs recommend combining them for long-running agents: compaction or editing keeps the live context lean, and memory makes sure nothing critical is lost in the process.
To keep the memory directory from sprawling, use the prompting lever the docs provide: instruct Claude to keep the memory folder coherent, rename or delete stale files, and avoid creating new files unless necessary. You can also scope content ("only write down information relevant to this project") and cap file sizes in your handler, paginating large reads.
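The handler-side half of that advice can be sketched in a few lines: one function that rejects oversized writes and one that returns a numbered slice of a file instead of the whole thing. The 16 KB cap, the function names, and the line-range parameters are all hypothetical choices for illustration. The numbering reuses the six-character padding from the view handler earlier in this tutorial.

```typescript
// Hypothetical cap; tune it to your own token budget.
const MAX_FILE_BYTES = 16 * 1024;

// Throw if a write would exceed the cap, measured in UTF-8 bytes.
function assertWithinCap(fileText: string, maxBytes: number = MAX_FILE_BYTES): void {
  const bytes = new TextEncoder().encode(fileText).length;
  if (bytes > maxBytes) {
    throw new Error(`Memory file is ${bytes} bytes; cap is ${maxBytes}`);
  }
}

// Return lines firstLine..lastLine (1-based, inclusive) with line numbers.
function numberedSlice(text: string, firstLine: number, lastLine: number): string {
  const lines = text.split("\n");
  const start = Math.max(1, firstLine);
  const end = Math.min(lines.length, lastLine);
  return lines
    .slice(start - 1, end)
    .map((line, i) => `${String(start + i).padStart(6)}\t${line}`)
    .join("\n");
}
```

Call assertWithinCap from create and from any edit that grows a file, and surface the error text to the model so it learns to split or prune notes rather than retrying the same write.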