AgentMemory: Persistent Context That Cuts AI Coding Agent Costs by 92%

AI coding agents like Claude Code, Cursor, and Gemini CLI share a fundamental design constraint: each session starts fresh. Engineers who spend 20 minutes explaining a codebase's architecture, past decisions, or preferred patterns to their agent lose that context the moment the session ends. The next session starts over from zero.

rohitg00/agentmemory picked up 400 GitHub stars on May 9, 2026, landing it on the daily trending list with a total of 3.1k stars and 319 forks. The repo's v0.9.4 shipped on April 29, 2026 - just ten days before this writing - and appears to have unlocked a burst of attention from developers frustrated with the re-explanation loop. The pitch is direct: a persistent memory layer benchmarked at 95.2% retrieval accuracy on LongMemEval-S, with a claimed 92% token reduction compared to stuffing full context into every prompt.

Those numbers are specific enough for anyone to check independently, which is a good sign for a project still under 4k stars.

What AgentMemory Does

AgentMemory runs as a local service that intercepts and organizes context across agent sessions. It uses a 4-tier memory architecture:

Working memory - the active session's recent tool calls and code changes
Episodic summaries - compressed records of past sessions
Semantic facts - extracted knowledge about the codebase (patterns, conventions, decisions)
Procedural workflows - reusable sequences the agent has learned
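To make the four tiers concrete, here is a minimal sketch of how such a store could be modeled. The class names, fields, and the `end_session` promotion step are illustrative assumptions, not AgentMemory's actual schema or API.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    WORKING = "working"        # recent tool calls and code changes
    EPISODIC = "episodic"      # compressed summaries of past sessions
    SEMANTIC = "semantic"      # extracted facts about the codebase
    PROCEDURAL = "procedural"  # reusable workflows


@dataclass
class MemoryRecord:
    tier: Tier
    text: str
    session_id: str


@dataclass
class MemoryStore:
    """Hypothetical in-memory store; a real one would persist to disk."""
    records: list[MemoryRecord] = field(default_factory=list)

    def add(self, tier: Tier, text: str, session_id: str) -> None:
        self.records.append(MemoryRecord(tier, text, session_id))

    def by_tier(self, tier: Tier) -> list[MemoryRecord]:
        return [r for r in self.records if r.tier is tier]

    def end_session(self, session_id: str, summary: str) -> None:
        # Assumption: working memory is discarded when a session ends
        # and replaced by a single episodic summary.
        self.records = [
            r for r in self.records
            if not (r.tier is Tier.WORKING and r.session_id == session_id)
        ]
        self.add(Tier.EPISODIC, summary, session_id)
```

The point of the split is that only the short-lived working tier is tied to one session; the other three survive and can be queried later.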

Retrieval uses a hybrid approach: BM25 keyword matching, vector embeddings, and knowledge graph traversal operate together so that both exact matches and semantically similar queries surface relevant context. The combination is designed to avoid the common failure modes of pure keyword search (misses paraphrase) and pure vector search (misses exact identifiers).
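The idea of combining keyword and vector rankings can be sketched in a few lines. The example below scores documents with Okapi BM25 and with cosine similarity over toy embedding vectors, then merges the two rankings with reciprocal rank fusion. The fusion method is an assumption chosen for illustration; the source does not say how AgentMemory combines its signals, and knowledge graph traversal is left out entirely.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Okapi BM25 score of each document for a whitespace-tokenized query."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for t in tokenized if term in t)
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1.0)
            freq = tf[term]
            denom = freq + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * freq * (k1 + 1) / denom
        scores.append(score)
    return scores


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def ranking(scores: list[float]) -> list[int]:
    """Document indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def rrf(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Reciprocal rank fusion: each ranking contributes 1 / (k + rank)."""
    fused: Counter = Counter()
    for order in rankings:
        for rank, doc_id in enumerate(order, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return [doc_id for doc_id, _ in fused.most_common()]
```

With a query such as an exact function name, the BM25 ranking favors the document containing that identifier, while the vector ranking can favor a paraphrase that never mentions it. Fusing the two keeps both near the top.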

The system exposes 51 MCP tools and 12 auto-capture hooks. MCP tools let any MCP-compatible agent query memory on demand. Auto-capture hooks integrate into Claude Code's hook system to capture context automatically as the agent works - no manual tagging required.

A real-time viewer runs at localhost:3113, showing what is stored and what is being retrieved during active sessions. For debugging unexpected memory hits or gaps, that viewer is the fastest diagnostic tool.

The entire implementation runs on SQLite plus the project's own iii-engine, with zero external service dependencies. No Pinecone account, no separate vector database, no additional API keys beyond what the agents already use.
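To show why a single SQLite file is enough for this kind of workload, here is a minimal sketch of a local memory table using Python's standard `sqlite3` module. The table layout and the function names `open_store`, `remember`, and `recall` are hypothetical and do not reflect AgentMemory's real schema or its iii-engine.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a local store; no server process is involved."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS memories (
               id INTEGER PRIMARY KEY,
               tier TEXT NOT NULL,
               content TEXT NOT NULL,
               created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def remember(conn: sqlite3.Connection, tier: str, content: str) -> int:
    cur = conn.execute(
        "INSERT INTO memories (tier, content) VALUES (?, ?)", (tier, content)
    )
    conn.commit()
    return cur.lastrowid


def recall(conn: sqlite3.Connection, keyword: str, limit: int = 5) -> list[str]:
    """Naive substring lookup, newest first; a stand-in for real retrieval."""
    rows = conn.execute(
        "SELECT content FROM memories WHERE content LIKE ? ORDER BY id DESC LIMIT ?",
        (f"%{keyword}%", limit),
    ).fetchall()
    return [row[0] for row in rows]
```

Passing a file path instead of `:memory:` persists the data across runs, which is the property that matters for cross-session memory.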

How to Install and Try It

The fastest path is the npx route:

npx @agentmemory/agentmemory

This starts the memory service locally. For Claude Code specifically, the plugin marketplace route gives deeper integration:

/plugin marketplace add rohitg00/agentmemory

For Cursor or any MCP-compatible agent, add the server to your MCP config:

{
  "mcpServers": {
    "agentmemory": {
      "command": "npx",
      "args": ["@agentmemory/agentmemory"]
    }
  }
}
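If an MCP config file already lists other servers, the new entry should be merged in rather than pasted over the whole file. The helper below is a small sketch of that merge; the function name `add_agentmemory` is hypothetical, and where the config file lives depends on the client, so the path is deliberately left to the caller.

```python
import json

# Server entry copied from the config shown above.
AGENTMEMORY_SERVER = {"command": "npx", "args": ["@agentmemory/agentmemory"]}


def add_agentmemory(config_text: str) -> str:
    """Return config JSON with the agentmemory server added.

    Existing servers are left untouched, and an existing
    "agentmemory" entry is not overwritten.
    """
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers.setdefault("agentmemory", dict(AGENTMEMORY_SERVER))
    return json.dumps(config, indent=2)
```

Read the current file contents, pass them through the function, and write the result back. An empty file yields a config containing only the agentmemory entry.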

Once running, the 12 auto-capture hooks begin logging tool activity. The agent starts retrieving relevant past context automatically at the beginning of each session rather than requiring the engineer to re-explain setup. The project ships with 827 passing tests and is Apache-2.0 licensed.

Who Should Use AgentMemory

Teams running repeated agent sessions on the same codebase. If a developer uses Claude Code daily on the same project, re-explaining context each morning is the biggest hidden cost in the workflow. AgentMemory automates that carry-forward.

Engineers optimizing for token spend. The README pairs the 92% token reduction claim with a concrete cost estimate: roughly $10/year versus $500 or more for context-heavy workflows. For developers running agents across multiple projects or for extended periods, that difference compounds quickly.
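A quick sanity check on those figures is worth doing before budgeting around them. The sketch below computes what a given reduction leaves of a baseline cost, and what reduction a pair of costs implies. The function names are illustrative; the inputs are the numbers quoted in this section.

```python
def cost_after_reduction(baseline_cost: float, reduction: float) -> float:
    """Cost remaining after cutting baseline_cost by the given fraction."""
    return baseline_cost * (1.0 - reduction)


def implied_reduction(baseline_cost: float, new_cost: float) -> float:
    """Fractional reduction implied by going from baseline_cost to new_cost."""
    return 1.0 - new_cost / baseline_cost


# Figures quoted from the README via this article.
after_92_percent = cost_after_reduction(500.0, 0.92)  # about 40 dollars
implied = implied_reduction(500.0, 10.0)              # about 0.98
```

A 92% cut on a $500 baseline leaves about $40, while $10 against $500 implies a cut of about 98%. The two figures therefore do not describe the same ratio. They may come from different scenarios or baselines, but that is a guess rather than something the source states, so measure your own spend before and after.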

Multi-agent setups. The cross-agent support via MCP and REST API means that when a Claude Code agent hands off a task to a subagent or a separate Cursor session, shared memory lets them start from the same knowledge base. This is the kind of coordination problem that gets expensive fast without a shared store - each agent re-building context independently defeats the purpose of parallelism.

Agent framework builders. The 51 MCP tools give framework developers fine-grained control over what memory gets stored, retrieved, and pruned. Projects building on top of the Claude Agent SDK or other orchestration layers can treat AgentMemory as a managed memory service rather than reinventing their own persistence layer.

Connection to the DevDigest Ecosystem

AgentMemory's hook system connects directly to patterns covered at hooks.developersdigest.tech. Claude Code hooks intercept tool calls, post-tool completions, and session lifecycle events. AgentMemory's 12 auto-capture hooks work within that same model - they register handlers that fire on tool execution, automatically writing context to the memory store without any explicit developer action. If you have already built custom hooks for Claude Code, AgentMemory's architecture will feel immediately familiar.
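For readers who have not written a hook before, the capture step can be sketched as a function that turns a hook payload into a memory record. Claude Code passes hook input as JSON on stdin; the field names used here (`hook_event_name`, `session_id`, `tool_name`, `tool_input`) follow the Claude Code hooks documentation as best understood, and should be verified against it. The function `capture_from_hook` and the record shape are hypothetical, not AgentMemory's implementation.

```python
import json
from typing import Optional


def capture_from_hook(payload_json: str) -> Optional[dict]:
    """Turn a post-tool hook payload into a small record, or None to skip it."""
    payload = json.loads(payload_json)
    # Only tool completions are captured in this sketch.
    if payload.get("hook_event_name") != "PostToolUse":
        return None
    tool_input = payload.get("tool_input", {})
    return {
        "session_id": payload.get("session_id"),
        "tool": payload.get("tool_name"),
        # Assumed: file tools carry "file_path", shell tools carry "command".
        "target": tool_input.get("file_path") or tool_input.get("command"),
    }
```

In an actual hook script, the payload would come from `sys.stdin.read()` and the returned record would be written to the memory store instead of being returned to a caller.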

The 51 MCP tools also fit into the broader MCP server landscape documented at mcp.developersdigest.tech. AgentMemory does not replace domain-specific MCP servers like browser control or database access - it adds a memory layer that makes those tools more useful across sessions by preserving what was learned from prior uses. A browser automation agent that spent a session mapping a site's navigation can share that map with the next session rather than rediscovering it.

For engineers managing multi-agent workflows or building on the Claude Agent SDK, AgentMemory is the kind of infrastructure that sits between the agent runtime and the codebase, handling state persistence so application code does not have to.

Honest Assessment

The benchmark numbers are the project's strongest asset. The 95.2% retrieval accuracy is reported on LongMemEval-S, a published benchmark rather than a self-defined metric, and 827 passing tests is meaningful coverage for a TypeScript library at this scope. Having zero external service dependencies keeps the operational surface area small: there are no third-party services to fail, spike in cost, or require separate monitoring.

The limitations worth flagging: 3.1k stars is still early for production infrastructure. The v0.9.x version series signals active development but also pre-1.0 instability, so breaking changes are more likely than in a stable release. The benchmark numbers come from the project's own documentation; no independent reproduction has been published yet.

The hook-based auto-capture is powerful but also means the memory store grows with every session. The README does not yet describe pruning policies or storage quotas in detail. For long-running projects with heavy agent usage, that accumulation could become a maintenance concern worth monitoring before it becomes a problem.
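One low-effort way to watch for that accumulation is to record the store's file size after each session and project when it would cross a limit you choose. The sketch below assumes you know where the SQLite file lives, which the source does not specify; the function names and the linear-growth model are illustrative assumptions.

```python
import math
import os
from typing import Optional


def store_size_bytes(db_path: str) -> int:
    """Current size of the store file on disk."""
    return os.path.getsize(db_path)


def sessions_until_limit(size_history_bytes: list[int], limit_bytes: int) -> Optional[int]:
    """Estimate sessions left before the store reaches limit_bytes.

    Uses average growth per session across the recorded history.
    Returns None when there is too little data or no growth.
    """
    if len(size_history_bytes) < 2:
        return None
    steps = len(size_history_bytes) - 1
    growth = (size_history_bytes[-1] - size_history_bytes[0]) / steps
    if growth <= 0:
        return None
    remaining = limit_bytes - size_history_bytes[-1]
    return max(0, math.ceil(remaining / growth))
```

Appending `store_size_bytes(path)` to a list after each session gives the history the estimator needs. A result that keeps shrinking is the signal to look for pruning options.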

Despite those caveats, this is worth evaluating seriously if you spend meaningful time re-explaining context to coding agents. Run it locally for a week, check the real-time viewer at localhost:3113 to verify that retrieval is surfacing the right facts, and measure whether token spend dropped. With no external services to provision, that evaluation is low-friction.

References

GitHub repository: https://github.com/rohitg00/agentmemory
npm package: https://www.npmjs.com/package/@agentmemory/agentmemory
LongMemEval benchmark paper: https://arxiv.org/abs/2410.10813
Claude Code hooks documentation: https://docs.anthropic.com/en/docs/claude-code/hooks
DevDigest hooks coverage: https://hooks.developersdigest.tech
DevDigest MCP server directory: https://mcp.developersdigest.tech
