
TL;DR
A new paper proposes inverting traditional agent architecture - making the append-only event log the source of truth, not an afterthought. HN debates whether this is novel or just CQRS with extra steps.
A short but provocative paper appeared on arXiv this week: "The Log is the Agent" by Yohei Nakajima (creator of BabyAGI). The core claim is simple but has significant implications for how we build AI agent systems.
Most agent frameworks treat logging as an afterthought - something you bolt on for debugging and compliance. Nakajima argues we should flip this: make the append-only event log the source of truth, and derive all agent state from that log.
Traditional agent architectures look like this:
LLM → State → Tools → World
↓
Logs (optional audit trail)
The "log is the agent" architecture inverts this:
Event Log (source of truth)
↓
Graph State (deterministic projection)
↓
Behaviors (react to graph changes, emit new events)
The key properties this enables:
The paper introduces ActiveGraph, a runtime that implements this pattern. The graph is never mutated directly - behaviors react to graph changes and emit new events, which get appended to the log. The working graph is just a projection of the log state.
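The loop described above can be sketched in a few lines. This is an illustration of the pattern only, not ActiveGraph's actual API: the event shapes, `project`, `append`, and the `linkTasksToRoot` behavior are all hypothetical names chosen for the example.

```typescript
// Hypothetical event and graph shapes; not ActiveGraph's actual API.
type GraphEvent =
  | { type: "node_added"; id: string; label: string }
  | { type: "edge_added"; from: string; to: string };

interface Graph {
  nodes: Map<string, string>;     // node id -> label
  edges: Array<[string, string]>; // [from, to]
}

// Deterministic projection: the same log always yields the same graph.
function project(log: readonly GraphEvent[]): Graph {
  const graph: Graph = { nodes: new Map(), edges: [] };
  for (const event of log) {
    if (event.type === "node_added") {
      graph.nodes.set(event.id, event.label);
    } else {
      graph.edges.push([event.from, event.to]);
    }
  }
  return graph;
}

// A behavior reads the projected graph and returns new events.
// It never mutates the graph itself.
type Behavior = (graph: Graph, latest: GraphEvent) => GraphEvent[];

// Example behavior: link every new "task" node to the "root" node if one exists.
const linkTasksToRoot: Behavior = (graph, latest) => {
  if (latest.type === "node_added" && latest.label === "task" && graph.nodes.has("root")) {
    return [{ type: "edge_added", from: "root", to: latest.id }];
  }
  return [];
};

// Append an event, then let behaviors react. Their reactions are appended too.
function append(log: GraphEvent[], event: GraphEvent, behaviors: Behavior[]): void {
  const queue: GraphEvent[] = [event];
  while (queue.length > 0) {
    const next = queue.shift()!;
    log.push(next);
    const graph = project(log);
    for (const behavior of behaviors) {
      queue.push(...behavior(graph, next));
    }
  }
}

const log: GraphEvent[] = [];
append(log, { type: "node_added", id: "root", label: "root" }, [linkTasksToRoot]);
append(log, { type: "node_added", id: "t1", label: "task" }, [linkTasksToRoot]);
const graph = project(log); // one edge: root -> t1
```

Re-projecting the whole log on every append keeps the sketch short; a real runtime would presumably update the projection incrementally.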
The Hacker News discussion is notably technical, with several experienced developers recognizing the pattern immediately.
"The AI folks have discovered CQRS?": Multiple commenters pointed out that this is essentially event sourcing and CQRS, patterns that have been standard in distributed systems for over a decade. One commenter wryly noted that the paper is "presenting common ideas as novel without thinking through existing problems."
Practical implementations: Several developers shared their own agent harnesses that use similar patterns. Lightspeed stores all context-affecting events in an event log, making forking trivial - just set a pointer to another sequence number. Another commenter is building a similar system on Elixir/Ash.
Cost concerns: One commenter raised a valid critique: "wouldn't feeding that log for each request/response iteration get expensive really fast?" This is the elephant in the room. Event sourcing traditionally works well because replaying events is cheap. But LLM calls are expensive - replaying a conversation by re-running the model means paying for all those tokens again. The paper stores model responses in the log to avoid re-generation, but that's replay-as-recording, not true deterministic replay.
Write-ahead logs from databases: One commenter with database experience noted that WAL (write-ahead log) patterns provide a natural interface between speculative agent work and durable world mutations. This connects to broader work on stealing database ideas for AI agents.
Skepticism about the paper itself: Some commenters were unimpressed by the paper's structure - "we discuss without claiming to demonstrate" raised eyebrows. Others noted the author is a VC rather than a researcher, though his BabyAGI work has been influential in the agent space.
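The cost concern in the thread comes down to the difference between re-running the model and reading back what it already said. Here is a minimal sketch of the replay-as-recording idea, assuming a synchronous stand-in for the model so the example stays self-contained. `recordingModel`, `ModelCallEvent`, and `fakeModel` are hypothetical names, not anything from the paper.

```typescript
// Hypothetical names throughout; the "model" is a plain function for illustration.
type Model = (prompt: string) => string;

interface ModelCallEvent {
  prompt: string;
  response: string;
}

// Live mode: call the model and record each response in the log.
// Replay mode: return recorded responses in order and never call the model.
function recordingModel(model: Model, log: ModelCallEvent[], replaying: boolean): Model {
  let cursor = 0;
  return (prompt) => {
    if (replaying) {
      const recorded = log[cursor];
      cursor += 1;
      if (recorded === undefined || recorded.prompt !== prompt) {
        // The run no longer matches the recording, so the log cannot answer.
        throw new Error(`Replay diverged at call ${cursor}`);
      }
      return recorded.response;
    }
    const response = model(prompt);
    log.push({ prompt, response });
    return response;
  };
}

let paidCalls = 0;
const fakeModel: Model = (prompt) => {
  paidCalls += 1; // stands in for token spend
  return `echo: ${prompt}`;
};

const callLog: ModelCallEvent[] = [];
const live = recordingModel(fakeModel, callLog, false);
const liveAnswer = live("plan the refactor");

const replay = recordingModel(fakeModel, callLog, true);
const replayAnswer = replay("plan the refactor"); // served from the log, no new model call
```

The divergence check is where the "not true deterministic replay" caveat shows up: the recording only helps as long as the replayed run asks the same questions in the same order.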
Even if the core insight isn't novel, the paper articulates something important: most agent frameworks get state management wrong.
Here's the problem. You start an agent session. It makes some tool calls. Then you want to fork it from an earlier point, compact its history, or replay what happened.
With most frameworks, this is surprisingly hard. The session state is scattered across:
If you store raw events from the start - every user message, every assistant response, every tool call and result - you can derive any projection you need. Want to fork? Just point to an earlier sequence number. Want to compact? Generate a summary event and start a new log segment. Want to replay? Feed the events back through your projection logic.
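// Event log approach
interface AgentEvent {
  id: string;
  timestamp: number;
  type: 'user_message' | 'assistant_message' | 'tool_call' | 'tool_result' | 'compaction';
  payload: unknown; // for 'compaction' events, assumed to look like { summary: ... }
  parentId?: string; // For forking
}

// Assumed minimal state shape for this sketch
interface ConversationState {
  messages: unknown[];
}

const initialState: ConversationState = { messages: [] };

// Current state is always derived
function projectState(events: AgentEvent[]): ConversationState {
  return events.reduce<ConversationState>((state, event) => {
    switch (event.type) {
      case 'user_message':
        return { ...state, messages: [...state.messages, event.payload] };
      case 'compaction':
        // A compaction event replaces the visible history with its summary
        return { ...state, messages: [(event.payload as { summary: unknown }).summary] };
      // etc.
      default:
        return state; // event types not handled in this sketch leave state unchanged
    }
  }, initialState);
}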
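Forking by "pointing to an earlier sequence number" can be sketched the same way. The shapes below are assumptions made for illustration (a shared log with a `seq` per event and a per-branch fork pointer); they are not taken from the paper or from any of the harnesses mentioned in the thread.

```typescript
// Hypothetical shapes for illustration only.
interface LoggedEvent {
  seq: number;    // position in the shared append-only log
  branch: string; // branch that appended this event
  text: string;
}

interface Branch {
  id: string;
  parent?: { branch: string; atSeq: number }; // fork pointer into the parent branch
}

// Events visible from a branch: its own events, plus each ancestor's
// events up to the sequence number where the fork happened.
function eventsFor(
  branchId: string,
  branches: Map<string, Branch>,
  log: readonly LoggedEvent[],
): LoggedEvent[] {
  const limits = new Map<string, number>(); // branch id -> highest visible seq
  let current: Branch | undefined = branches.get(branchId);
  let limit = Infinity;
  while (current !== undefined) {
    limits.set(current.id, limit);
    if (current.parent === undefined) break;
    limit = Math.min(limit, current.parent.atSeq);
    current = branches.get(current.parent.branch);
  }
  return log.filter((e) => {
    const max = limits.get(e.branch);
    return max !== undefined && e.seq <= max;
  });
}

const sharedLog: LoggedEvent[] = [
  { seq: 1, branch: "main", text: "user: fix the bug" },
  { seq: 2, branch: "main", text: "assistant: trying approach A" },
  { seq: 3, branch: "main", text: "tool: approach A failed" },
  { seq: 4, branch: "retry", text: "assistant: trying approach B" },
];

const branches = new Map<string, Branch>([
  ["main", { id: "main" }],
  ["retry", { id: "retry", parent: { branch: "main", atSeq: 1 } }],
]);

const retryView = eventsFor("retry", branches, sharedLog).map((e) => e.seq); // [1, 4]
const mainView = eventsFor("main", branches, sharedLog).map((e) => e.seq);   // [1, 2, 3]
```

Creating the fork copies nothing: the `retry` branch is just a pointer, and the failed attempt stays in the log for later inspection.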
The paper's insight connects to a broader pattern: AI agents are essentially distributed systems with unreliable components (the LLM), and we should apply distributed systems patterns to them.
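One such pattern, raised in the HN thread, is the write-ahead log: record the intent before touching the outside world, then record that it completed. The sketch below illustrates the idea only. `WalRecord`, `applyDurably`, and `pendingIntents` are hypothetical names, and the record format is not any database's actual WAL format.

```typescript
// Hypothetical record shapes; a sketch of the write-ahead idea only.
type WalRecord =
  | { kind: "intent"; opId: string; action: string }
  | { kind: "committed"; opId: string };

// Log the intent first, perform the world mutation, then log the commit.
function applyDurably(wal: WalRecord[], opId: string, action: string, effect: () => void): void {
  wal.push({ kind: "intent", opId, action });
  effect(); // may throw, leaving an intent with no commit record
  wal.push({ kind: "committed", opId });
}

// After a failure, any intent without a matching commit still needs
// a retry policy or a human to decide what happened.
function pendingIntents(wal: readonly WalRecord[]): string[] {
  const committed = new Set(wal.filter((r) => r.kind === "committed").map((r) => r.opId));
  return wal
    .filter((r) => r.kind === "intent" && !committed.has(r.opId))
    .map((r) => r.opId);
}

const wal: WalRecord[] = [];
const sentEmails: string[] = [];

applyDurably(wal, "op-1", "send_email", () => {
  sentEmails.push("hello");
});

try {
  applyDurably(wal, "op-2", "charge_card", () => {
    throw new Error("network down");
  });
} catch {
  // Expected in this example; the WAL still holds the op-2 intent.
}

const pending = pendingIntents(wal); // ["op-2"]
```

Everything the agent does speculatively can stay in memory; only operations that change the world need to pass through `applyDurably`.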
The log-centric architecture echoes several database concepts:
As agents get more complex - longer runs, more tools, multi-step planning - these patterns become essential. You can't debug a 2-hour agent run by reading through 500 messages. You need structured replay, causal tracing, and the ability to "what if" from any point.
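Causal tracing falls out of the log almost for free if each event records what triggered it. Here is a minimal sketch, assuming a hypothetical `causedBy` field (the article's `parentId` could serve a similar role, but that is an assumption):

```typescript
interface TracedEvent {
  id: string;
  causedBy?: string; // hypothetical field: id of the event that triggered this one
  description: string;
}

// Walk causal links backwards from one event to its root cause,
// returning the chain in chronological order.
function causalChain(events: readonly TracedEvent[], id: string): string[] {
  const byId = new Map(events.map((e) => [e.id, e] as const));
  const chain: string[] = [];
  const seen = new Set<string>(); // guards against accidental cycles
  let current: TracedEvent | undefined = byId.get(id);
  while (current !== undefined && !seen.has(current.id)) {
    seen.add(current.id);
    chain.unshift(current.description);
    current = current.causedBy === undefined ? undefined : byId.get(current.causedBy);
  }
  return chain;
}

const run: TracedEvent[] = [
  { id: "e1", description: "user asked for a deploy" },
  { id: "e2", causedBy: "e1", description: "assistant called run_tests" },
  { id: "e3", causedBy: "e2", description: "run_tests failed" },
  { id: "e4", causedBy: "e1", description: "assistant asked a clarifying question" },
];

const chain = causalChain(run, "e3");
// ["user asked for a deploy", "assistant called run_tests", "run_tests failed"]
```

Instead of reading 500 messages, you start from the failing event and read only the handful that led to it.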
If you're building agent systems, consider:
The paper's ActiveGraph implementation is available to try, though several commenters noted it's early-stage and the author's website doesn't even have a valid SSL cert.
Whether or not you use ActiveGraph, the log-centric mental model is worth internalizing. As one commenter put it: "This paper points at an idea, but it's really only legible if you have a more developed version of the idea already."
The most common HN response was some variation of "this is just event sourcing." And they're right - the patterns are well-established. The contribution isn't inventing something new; it's applying known patterns to a domain where they're surprisingly underused.
Most agent frameworks are still in the "mutate state directly" paradigm. They store message histories as mutable arrays, compact them in place, and lose fidelity in the process. The log-centric approach is more work upfront but pays dividends in debuggability, reproducibility, and composability.
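For contrast with in-place compaction, here is a rough sketch of the append-only alternative: the summary opens a new log segment and the old segment is left untouched. `Segment`, `compact`, and `contextWindow` are hypothetical names, and the `summarize` callback stands in for whatever produces the summary (in practice, probably a model call).

```typescript
// Hypothetical segment model: each segment may start with a summary of everything before it.
interface Segment {
  summary?: string;
  events: string[];
}

// What the model would see for a given segment: its summary (if any) plus its events.
function visible(segment: Segment): string[] {
  return segment.summary === undefined ? segment.events : [segment.summary, ...segment.events];
}

// Close the current segment and open a new one seeded with a summary.
// Old segments are kept as-is, so no fidelity is lost.
function compact(segments: readonly Segment[], summarize: (events: string[]) => string): Segment[] {
  const current = segments[segments.length - 1];
  if (current === undefined) return [{ events: [] }];
  return [...segments, { summary: summarize(visible(current)), events: [] }];
}

// The model-facing context is only the latest segment.
function contextWindow(segments: readonly Segment[]): string[] {
  const current = segments[segments.length - 1];
  return current === undefined ? [] : visible(current);
}

const before: Segment[] = [
  { events: ["user: hi", "assistant: hello", "user: deploy it"] },
];
const after = compact(before, (events) => `summary of ${events.length} events`);
after[after.length - 1]?.events.push("assistant: deploying");

const context = contextWindow(after);
// ["summary of 3 events", "assistant: deploying"]
```

The context sent to the model shrinks, but `after[0]` still holds the three original events for debugging or for re-summarizing later with a better prompt.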
The AI community has a habit of rediscovering established CS patterns. Sometimes that's frustrating. Sometimes it's necessary - the old patterns need to be re-articulated for a new context. This paper does the latter, even if imperfectly.