
AI Tools Deep Dive
TL;DR
Agent runs are opaque. TraceTrail turns a Claude Code JSONL into a public share link with a stepped timeline of messages, tool calls, and tokens.
You give an AI coding agent a task. Twenty minutes later it comes back with a diff, a passing test, and a vague summary of what it did. If the diff is right, you ship it and move on. If something is off, you have a problem.
The actual run lives inside a transcript file somewhere on disk. For Claude Code that is a JSONL under ~/.claude/projects/<dir>/<sid>.jsonl. Hundreds of lines of message blocks, tool calls, tool results, and usage records. Readable, technically. Useful, not really.
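To make that concrete, here is roughly what one line of such a transcript looks like (illustrative and trimmed; exact fields vary by Claude Code version, and the placement of usage here is an assumption):

{"type":"assistant","message":{"role":"assistant","content":[{"type":"tool_use","name":"Bash","input":{"command":"pnpm test"}}],"usage":{"input_tokens":2412,"output_tokens":187}}}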
So you do one of three bad things. You scroll the terminal scrollback until your eyes glaze over. You paste the JSONL into a chat window and ask another model to summarize it. Or you give up and re-run the task with extra logging, which means the original failure is now gone.
This is the gap. Agent runs have no shareable artifact. There is no link you can drop into a thread that says "here is exactly what the agent did, step by step, with the tool calls and the token spend, in a UI a human can scan in thirty seconds."
That is what TraceTrail is. The missing share link for AI coding agents.
Upload an agent transcript. Get a public /r/<id> URL. Anyone with the link can replay the run as a stepped timeline.
The mental model is Loom, but for agents instead of screen recordings. You ran something private. You want to show somebody what happened. You generate a link and paste it.
TraceTrail is a Next.js app backed by Neon Postgres and Clerk. The MVP is intentionally small. Three routes, one parser, one timeline view.
If you are running it locally, the setup is the standard shape:
git clone <your-tracetrail-repo>
cd tracetrail
pnpm install
cp .env.example .env.local # fill in Clerk + DATABASE_URL
psql "$DATABASE_URL" -f drizzle/0000_initial.sql
pnpm dev
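If you are filling in .env.local by hand, the variables in play are Clerk's standard key pair plus the Neon connection string. The names below follow Clerk's and Neon's usual conventions; treat .env.example as the authoritative list:

NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=pk_test_...
CLERK_SECRET_KEY=sk_test_...
DATABASE_URL=postgresql://user:password@<your-neon-host>/tracetrail?sslmode=require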
Open http://localhost:3000, sign in through Clerk, and you land on a single upload form. Drag a transcript onto it, or pick a file. The accepted shapes are:
- Claude Code JSONL: one JSON object per line, with type, message.role, and message.content blocks. This is the format Claude Code already writes to ~/.claude/projects/.
- A bare array of { role, content } message objects. This is what most generic agent frameworks emit.
- An object with an events: [...] field. For frameworks that wrap their runs in metadata.

Behind the form is POST /api/upload. It is auth-gated: you have to be signed in to push a transcript. The endpoint returns { id, url }. The url is your share link.
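For a sense of the shape, here is a minimal client-side sketch of that call, posting the transcript as multipart form data. Only the route and the { id, url } response come from the app; the "file" field name and the error handling are assumptions:

async function uploadTranscript(file: File): Promise<{ id: string; url: string }> {
  const body = new FormData();
  body.append("file", file); // field name is an assumption
  const res = await fetch("/api/upload", { method: "POST", body }); // Clerk session cookie rides along
  if (!res.ok) throw new Error(`upload failed: ${res.status}`); // not signed in, or over the size cap
  return res.json();
}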
The replay route, GET /r/[id], is public on purpose. Once a run is uploaded, anyone with the link can watch it. This is the Loom tradeoff. Public-by-default is the whole point of a share link. If a run contains anything sensitive, do not upload it. There is no redaction in the MVP and there is no delete UX yet either.
The replay page is a stepped timeline. Each event in the transcript becomes one step. The parser at src/lib/parse.ts flattens the raw JSONL into four event kinds:
- Messages. User and assistant turns each become a step.
- Tool calls. Each tool_use block becomes its own step, with the tool name and the input JSON. Bash commands, file reads, edits, web fetches, MCP calls. All of it.
- Tool results. Long outputs are truncated, so a verbose ls does not balloon the page. Errors are flagged.
- System events, labeled system.

At the top of the page you get the totals: input tokens, output tokens, message count, tool call count. Token totals only show up when the source transcript actually included usage.input_tokens and usage.output_tokens. There is no tokenizer fallback. If your framework does not record usage, that section will be zeros, and that is honest.
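In TypeScript terms the flattened model is small. Here is a sketch of what the parser's output could look like; the type and field names are assumptions, while the four kinds and the usage.input_tokens / usage.output_tokens fields are the parts described above:

type TraceEvent =
  | { kind: "message"; role: "user" | "assistant"; text: string }
  | { kind: "tool_call"; tool: string; input: unknown }
  | { kind: "tool_result"; output: string; isError: boolean }
  | { kind: "system"; text: string };

// Totals come only from recorded usage; there is deliberately no tokenizer fallback.
function tokenTotals(
  lines: Array<{ usage?: { input_tokens?: number; output_tokens?: number } }>,
): { input: number; output: number } {
  return lines.reduce(
    (acc, line) => ({
      input: acc.input + (line.usage?.input_tokens ?? 0),
      output: acc.output + (line.usage?.output_tokens ?? 0),
    }),
    { input: 0, output: 0 },
  );
}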
The visual job of the timeline is just to make the run scannable. You should be able to skim the steps, see where the agent went off the rails, expand the tool result that matters, and close the tab. No video player to scrub through. No chat UI to scroll. Just a list of what happened, in order.
Once you have a share link primitive, a bunch of workflows that used to be painful become one paste.
Debugging your own runs. When a long agent run produces a wrong answer, you upload the JSONL and look for the moment things went sideways. Usually it is one bad tool result that the agent then built ten more steps on top of. Seeing the timeline at a glance is faster than grep-ing the JSONL.
Onboarding teammates. New person joins. You want to show them how Claude Code actually works in your repo. You drop three replay links into the onboarding doc: a clean run, a recovered run, a failed run. They scrub through in five minutes and get more context than an hour of pairing.
Showing clients or stakeholders. Non-engineers do not want to watch a screen recording of you typing. They want to see "the AI did these eight steps and produced this PR." A replay link is the right object for that conversation. It is also the right object to attach to a status update.
Evaluating sub-agents. If you run agent teams, you have N parallel runs per task. Having a stable URL per run lets you compare them the way you would compare videos in the compare hub. Pick the cleanest run. Link it. Move on.
Pairing with another agent. Tools like Promptlock version the prompts that go in. TraceTrail captures the runs that come out. Together they close a loop: you can change a prompt, replay the resulting agent run, link the replay back to the prompt version, and have a real audit trail.
The MVP is deliberately narrow. A few things people will ask for that are not in this version:
- No redaction. Treat it like gist.github.com: only paste what you would paste into a public gist.
- No large uploads. The /api/upload route buffers the whole file in memory with a 10 MB cap. Long agent runs in the tens of MB will fail. Chunked ingest is on the list.

These are all known. The first version ships the share link primitive and nothing else, because the share link is the whole product.
If you run any agent that writes a transcript to disk, you can use TraceTrail today. The fastest path is to grab one of your existing Claude Code session files, sign in, drag it onto the form, and paste the resulting URL into your team chat. That is the entire onboarding.
For deeper agent tooling, pair it with the patterns in Prompt Versioning with Promptlock and the compare hub. Versioned prompts on the way in. Replayable runs on the way out. Two share-link primitives that finally make agent work feel like normal software work.
Screenshots TODO: upload form, replay timeline, tool call expanded view, totals header.