The Log Is the Agent: Event Sourcing Comes to AI Systems

A short but provocative paper appeared on arXiv this week: "The Log is the Agent" by Yohei Nakajima (creator of BabyAGI). The core claim is simple but has significant implications for how we build AI agent systems.

Most agent frameworks treat logging as an afterthought - something you bolt on for debugging and compliance. Nakajima argues we should flip this: make the append-only event log the source of truth, and derive all agent state from that log.

The Core Idea

Traditional agent architectures look like this:

LLM → State → Tools → World
         ↓
       Logs (optional audit trail)

The "log is the agent" architecture inverts this:

Event Log (source of truth)
     ↓
Graph State (deterministic projection)
     ↓
Behaviors (react to graph changes, emit new events)

The key properties this enables:

Deterministic replay - you can reconstruct any agent run from its event log
Cheap forking - branch at any point without re-executing the shared prefix
Full lineage - trace from high-level goals down to individual model calls

The paper introduces ActiveGraph, a runtime that implements this pattern. The graph is never mutated directly - behaviors react to graph changes and emit new events, which get appended to the log. The working graph is just a projection of the log state.
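The loop described above can be sketched in a few lines. This illustrates the pattern only and is not ActiveGraph's API: GraphEvent, Graph, project, Behavior, finishOpenNodes, and step are all hypothetical names invented for this example.

```typescript
// Hypothetical sketch of the pattern; none of these names come from ActiveGraph.
type GraphEvent =
  | { type: 'node_added'; id: string; label: string }
  | { type: 'node_done'; id: string };

type Graph = Record<string, { label: string; done: boolean }>;

// The graph is a pure function of the log: same events in, same graph out.
function project(log: readonly GraphEvent[]): Graph {
  const graph: Graph = {};
  for (const e of log) {
    if (e.type === 'node_added') {
      graph[e.id] = { label: e.label, done: false };
    } else {
      const node = graph[e.id];
      if (node) graph[e.id] = { ...node, done: true };
    }
  }
  return graph;
}

// A behavior reads the projected graph and returns new events.
// It never writes to the graph itself.
type Behavior = (graph: Graph) => GraphEvent[];

// Example behavior: emit a 'node_done' event for every unfinished node.
const finishOpenNodes: Behavior = (graph) =>
  Object.keys(graph)
    .filter((id) => !graph[id].done)
    .map((id): GraphEvent => ({ type: 'node_done', id }));

// One step: project the log, run the behaviors, return the log with their events appended.
function step(log: readonly GraphEvent[], behaviors: Behavior[]): GraphEvent[] {
  const graph = project(log);
  const emitted: GraphEvent[] = [];
  for (const behave of behaviors) emitted.push(...behave(graph));
  return [...log, ...emitted];
}
```

Because step returns a new log instead of touching the graph, the only way state changes is through appended events, which is the property the paper relies on.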

What HN Is Saying

The Hacker News discussion is notably technical, with several experienced developers recognizing the pattern immediately.

"The AI folks have discovered CQRS?": Multiple commenters pointed out that this is essentially event sourcing and CQRS, patterns that have been standard in distributed systems for over a decade. One commenter wryly noted that the paper is "presenting common ideas as novel without thinking through existing problems."

Practical implementations: Several developers shared their own agent harnesses that use similar patterns. Lightspeed stores all context-affecting events in an event log, making forking trivial - just set a pointer to another sequence number. Another commenter is building a similar system on Elixir/Ash.

Cost concerns: One commenter raised a valid critique: "wouldn't feeding that log for each request/response iteration get expensive really fast?" This is the elephant in the room. Event sourcing traditionally works well because replaying events is cheap, but LLM calls are not: re-executing a conversation means paying for all those tokens again. The paper stores model responses in the log to avoid re-generation, but that's replay-as-recording, not true deterministic replay of the model itself.
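The difference between the two kinds of replay is easier to see in code. In the sketch below, a replayed turn returns the recorded response instead of calling the model, so it costs no tokens, but it reproduces what happened and does not re-derive it. ModelCall, CallModel, and runTurn are hypothetical names, not taken from the paper, and the sketch assumes model calls happen in a fixed order so that a call index identifies a log entry.

```typescript
// Hypothetical names throughout; this illustrates replay-as-recording, not the paper's code.
interface ModelCall {
  prompt: string;
  response: string;
}

type CallModel = (prompt: string) => Promise<string>;

// If the log already holds an entry at `index`, return the recorded response
// and never call the model. Otherwise call the model once and append the result.
async function runTurn(
  log: ModelCall[],
  index: number,
  prompt: string,
  callModel: CallModel,
): Promise<string> {
  const recorded = log[index];
  if (recorded !== undefined) {
    // A different prompt means the run no longer matches the recording.
    if (recorded.prompt !== prompt) {
      throw new Error(`replay diverged at call ${index}`);
    }
    return recorded.response; // replay: no tokens spent
  }
  const response = await callModel(prompt); // live: pay for the call once
  log.push({ prompt, response });
  return response;
}
```

The divergence check matters: as soon as a replayed run builds a different prompt, the recording stops being valid and the call has to be paid for again.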

Write-ahead logs from databases: One commenter with database experience noted that WAL (write-ahead log) patterns provide a natural interface between speculative agent work and durable world mutations. This connects to broader work on stealing database ideas for AI agents.
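One way to read that comment is as the usual write-ahead discipline applied to tool calls: record the intent before touching the world, record completion afterwards, and treat any intent without a completion as an action whose outcome is unknown. The sketch below assumes that reading. WalEntry, execute, and pending are hypothetical names, and the in-memory array stands in for durable storage.

```typescript
// Hypothetical sketch of a write-ahead discipline for tool side effects.
type WalEntry =
  | { kind: 'intent'; id: string; action: string }
  | { kind: 'done'; id: string };

// Record the intent before performing the side effect, then record completion.
// If `effect` throws (or the process dies), no 'done' entry is written.
function execute(wal: WalEntry[], id: string, action: string, effect: () => void): void {
  wal.push({ kind: 'intent', id, action });
  effect();
  wal.push({ kind: 'done', id });
}

// Intents with no matching 'done' are the actions to inspect or retry on recovery.
function pending(wal: readonly WalEntry[]): string[] {
  const done = wal.filter((e) => e.kind === 'done').map((e) => e.id);
  return wal
    .filter((e) => e.kind === 'intent' && done.indexOf(e.id) === -1)
    .map((e) => e.id);
}
```

Whether a pending action is safe to retry depends on the tool: an idempotent write can simply be repeated, while something like a payment needs a check against the external system first.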

Skepticism about the paper itself: Some commenters were unimpressed by the paper's structure - "we discuss without claiming to demonstrate" raised eyebrows. Others noted the author is a VC rather than a researcher, though his BabyAGI work has been influential in the agent space.

Why This Matters for Agent Developers

Even if the core insight isn't novel, the paper articulates something important: most agent frameworks get state management wrong.

Here's the problem. You start an agent session. It makes some tool calls. You want to:

Fork the session to try a different approach
Replay a failed run to debug it
Compact the context window without losing fidelity

With most frameworks, this is surprisingly hard. The session state is scattered across:

The message history (mutable, often compacted)
Tool call results (sometimes stored, sometimes not)
Internal state (often in-memory only)
External side effects (irreversible)

If you store raw events from the start - every user message, every assistant response, every tool call and result - you can derive any projection you need. Want to fork? Just point to an earlier sequence number. Want to compact? Generate a summary event and start a new log segment. Want to replay? Feed the events back through your projection logic.

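// Event log approach
// The Message and ConversationState shapes are assumed here for illustration.
type Message = { role: 'user' | 'assistant' | 'tool'; content: string };

interface BaseEvent {
  id: string;
  timestamp: number;
  parentId?: string; // For forking
}

type AgentEvent =
  | (BaseEvent & { type: 'user_message' | 'assistant_message' | 'tool_call' | 'tool_result'; payload: Message })
  | (BaseEvent & { type: 'compaction'; payload: { summary: Message } });

interface ConversationState {
  messages: Message[];
}

const initialState: ConversationState = { messages: [] };

// Current state is always derived
function projectState(events: AgentEvent[]): ConversationState {
  return events.reduce<ConversationState>((state, event) => {
    switch (event.type) {
      case 'user_message':
      case 'assistant_message':
      case 'tool_call':
      case 'tool_result':
        // Ordinary events append their payload to the history
        return { ...state, messages: [...state.messages, event.payload] };
      case 'compaction':
        // A compaction event replaces everything before it with its summary
        return { ...state, messages: [event.payload.summary] };
    }
  }, initialState);
}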
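Forking by sequence number, as described above, can be sketched separately. This is one possible layout and not the design of any harness mentioned here: a branch holds a pointer into its parent plus its own appended events, so the shared prefix is referenced and never copied or re-executed. LogEvent, Branch, fork, append, and resolve are hypothetical names.

```typescript
// Hypothetical sketch: a branch is a pointer into its parent plus its own events.
interface LogEvent {
  seq: number;
  text: string;
}

interface Branch {
  parent?: { branch: Branch; uptoSeq: number }; // shared prefix, referenced and not copied
  own: LogEvent[]; // events appended on this branch only
}

// Resolve a branch to its full event list: the parent's prefix, then its own events.
function resolve(branch: Branch): LogEvent[] {
  const parent = branch.parent;
  const prefix = parent
    ? resolve(parent.branch).filter((e) => e.seq <= parent.uptoSeq)
    : [];
  return [...prefix, ...branch.own];
}

// Forking records where to branch from. Nothing is copied or re-run.
function fork(branch: Branch, uptoSeq: number): Branch {
  return { parent: { branch, uptoSeq }, own: [] };
}

// Append assigns the next sequence number as seen from this branch.
function append(branch: Branch, text: string): void {
  const events = resolve(branch);
  const last = events[events.length - 1];
  branch.own.push({ seq: last ? last.seq + 1 : 1, text });
}
```

Branching at turn 2 of a three-turn session is then fork(session, 2): the new branch sees turns 1 and 2, and whatever it appends never touches the original.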

The Deeper Connection to Databases

The paper's insight connects to a broader pattern: AI agents are essentially distributed systems with unreliable components (the LLM), and we should apply distributed systems patterns to them.

The log-centric architecture echoes several database concepts:

Write-ahead logging - durability through append-only logs
Event sourcing - state as projection of events
MVCC - multiple versions (branches) from shared history
Snapshot isolation - consistent reads at a point in time

As agents get more complex - longer runs, more tools, multi-step planning - these patterns become essential. You can't debug a 2-hour agent run by reading through 500 messages. You need structured replay, causal tracing, and the ability to "what if" from any point.
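Causal tracing is the simplest of these to sketch. Assuming each event records the id of the event that caused it (the causedBy field below is a hypothetical name, as are TracedEvent and lineage), the path from a single model call back to the goal that triggered it is a pointer walk.

```typescript
// Hypothetical: each event names the earlier event that caused it.
interface TracedEvent {
  id: string;
  causedBy?: string;
  label: string;
}

// Walk causedBy pointers from one event back to its root, returning labels root-first.
function lineage(log: readonly TracedEvent[], id: string): string[] {
  const byId: Record<string, TracedEvent> = {};
  for (const e of log) byId[e.id] = e;

  const path: string[] = [];
  let current: TracedEvent | undefined = byId[id];
  // The length guard stops the walk if malformed data ever contains a cycle.
  while (current && path.length < log.length) {
    path.unshift(current.label);
    current = current.causedBy ? byId[current.causedBy] : undefined;
  }
  return path;
}
```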

Practical Implications

If you're building agent systems, consider:

Store raw events, not just messages - tool calls, results, state changes, everything
Make your log append-only - never mutate past events, only append corrections
Derive context windows from logs - don't mutate the message array directly
Design for replay - can you reconstruct any session from its log?
Think about forking - how would you branch at turn 47 of a 100-turn session?
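Two of these points, appending corrections and deriving the context window, fit in one short sketch. The event shapes are assumptions made for illustration (Entry, its correction variant, and contextWindow are hypothetical names): a correction is a new event that targets an earlier one, and the projection decides which text the model sees.

```typescript
// Hypothetical event shapes for illustration.
type Entry =
  | { seq: number; kind: 'message'; text: string }
  | { seq: number; kind: 'correction'; target: number; text: string };

// Build the context window from the log. Past entries are never edited:
// a correction only changes what the projection returns for its target.
function contextWindow(log: readonly Entry[], maxMessages: number): string[] {
  const latest: Record<number, string> = {};
  const order: number[] = [];
  for (const e of log) {
    if (e.kind === 'message') {
      latest[e.seq] = e.text;
      order.push(e.seq);
    } else if (latest[e.target] !== undefined) {
      latest[e.target] = e.text;
    }
  }
  // Keep only the most recent maxMessages messages.
  const start = Math.max(0, order.length - maxMessages);
  return order.slice(start).map((seq) => latest[seq]);
}
```

The log still contains the original text of every message, so a different projection (a wider window, or one that ignores corrections) can be derived later without losing anything.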

The paper's ActiveGraph implementation is available to try, though several commenters noted it's early-stage and the author's website doesn't even have a valid SSL cert.

Whether or not you use ActiveGraph, the log-centric mental model is worth internalizing. As one commenter put it: "This paper points at an idea, but it's really only legible if you have a more developed version of the idea already."

The "Just Event Sourcing" Critique

The most common HN response was some variation of "this is just event sourcing." And they're right - the patterns are well-established. The contribution isn't inventing something new; it's applying known patterns to a domain where they're surprisingly underused.

Most agent frameworks are still in the "mutate state directly" paradigm. They store message histories as mutable arrays, compact them in place, and lose fidelity in the process. The log-centric approach is more work upfront but pays dividends in debuggability, reproducibility, and composability.

The AI community has a habit of rediscovering established CS patterns. Sometimes that's frustrating. Sometimes it's necessary - the old patterns need to be re-articulated for a new context. This paper does the latter, even if imperfectly.

Sources

The Log is the Agent - Original arXiv paper
Hacker News discussion - 34 comments
ActiveGraph - Paper's implementation
Lightspeed agent harness - Similar pattern in practice
Event Sourcing - Martin Fowler's canonical explanation

