One Endpoint, Every Capability: A Reference Architecture for Progressive Disclosure

Two earlier posts built up one idea in stages. The first argued that SKILL.md and the Model Context Protocol solve two halves of the same problem, and that the useful move is to serve skills over MCP so an agent pays context cost only in proportion to what a task needs. The second removed two constraints from that design: skills no longer had to be ours, and a skill file no longer had to be copied in ahead of time. A file could be a link, fetched only at the moment an agent reached for it.

This post is the capstone. It is not a new feature so much as the shape the whole platform settled into once those pieces were in place. The claim is narrow and, I think, useful: skills, files, memory, and generation do not need four separate integrations. They need one endpoint, one auth surface, one billing surface, and one organizing principle applied consistently across all of them. That principle is tiered disclosure. What follows is the architecture and the reasoning, because the architecture is the interesting part, not any single tool.

The one endpoint

Everything a member's agents can do lives at a single streamable HTTP MCP endpoint: /api/mcp. Point any MCP-capable client at that URL with a dd_live_ API key and the tools appear. There is no second endpoint for skills, no separate service for files, no different auth for generation. The full catalog is documented in the repo as a canonical reference, but the shape is easy to hold in your head, because it is four families of capability on one surface.

The first family is generation: generate_image and generate_voice. These are the metered tools, and they are the only ones that cost credits. Each one does the work, persists the result to the caller's gallery, and hands back a durable URL, so a generation is not a throwaway artifact but a file that now exists in the account.

The second family is files and assets: list_folders, list_files, get_file, and list_assets. This is where everything a member uploads or generates becomes reachable as context. An agent can list what is there and pull one file's contents on demand.

The third family is memory: save_memory, list_memories, and search_memories. These are durable notes and links that survive across sessions and machines, so an agent can persist a decision in one run and recall it in the next, on a different computer, weeks later.

The fourth family is the library: list_skills, get_skill, get_skill_file, plus the sibling tools for copyable subagent definitions and design contracts. This is the skills-over-MCP surface the earlier posts built, now including a member's own authored skills scoped to their key.

Four families, one endpoint. The reason that consolidation matters is not tidiness. It is that a single endpoint with a single key is the difference between an agent that can reach your whole working context and an agent that can reach whichever one integration you wired up this week.
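As a rough illustration of what "one endpoint, one key" means on the wire, here is a minimal sketch that builds (but does not send) the JSON-RPC requests an MCP client would POST. The host `example.com` and the key value are placeholders, and a real client would also perform the protocol's initialize handshake before listing or calling tools; treat this as a sketch of the request shape, not the platform's client code.

```python
import json
import urllib.request
from typing import Optional

# Placeholder host: the post only names the path /api/mcp.
BASE_URL = "https://example.com"
MCP_PATH = "/api/mcp"


def build_mcp_request(api_key: str, method: str,
                      params: Optional[dict] = None,
                      request_id: int = 1) -> urllib.request.Request:
    """Build (without sending) a JSON-RPC 2.0 POST aimed at the MCP endpoint."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return urllib.request.Request(
        BASE_URL + MCP_PATH,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # The API key travels as a Bearer token; there is no separate session.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
    )


# Discover the tools, then call one of them, both against the same URL.
list_tools = build_mcp_request("dd_live_example", "tools/list")
call_tool = build_mcp_request(
    "dd_live_example",
    "tools/call",
    {"name": "list_skills", "arguments": {}},
    request_id=2,
)
```

Every family, from generation to the library, would be reached by changing only the `name` and `arguments` in the second request.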

Tiered disclosure is the organizing principle

The thing that keeps four capability families from collapsing into an unusable wall of tool schemas is that they all follow the same loading discipline. Anthropic's Agent Skills named it for knowledge packaging: progressive disclosure, where the agent sees short descriptions first, pulls a full body only for the item it chose, and reads deeper reference material only as the work demands. We apply that same staging to every family on the endpoint.

For skills it is three tiers. list_skills returns a lean index, a slug and a one-line description each, cheap enough to hold a hundred of them in context. get_skill returns one skill's body plus a manifest of its files (each file's path and one-line purpose), still with no file contents. get_skill_file returns the raw contents of exactly one file, and for a linked file it fetches the remote source at that moment. Three calls, each one paying only for the depth it reached.
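The three tiers can be sketched with an in-memory catalog. The tool names come from the post; the field names, the sample skill, and the data layout are assumptions made for illustration, and the linked-file fetch is left out.

```python
from dataclasses import dataclass, field


@dataclass
class SkillFile:
    path: str
    purpose: str
    content: str


@dataclass
class Skill:
    slug: str
    description: str
    body: str
    files: list = field(default_factory=list)


# Hypothetical catalog with a single skill.
CATALOG = {
    "deploy-runbook": Skill(
        slug="deploy-runbook",
        description="Steps for shipping a release.",
        body="# Deploy runbook\n1. Run checks.\n2. Ship.",
        files=[SkillFile("scripts/check.sh", "Pre-deploy checks.",
                         "#!/bin/sh\necho ok\n")],
    ),
}


def list_skills() -> list:
    """Tier 1: lean index, slug and one-line description only."""
    return [{"slug": s.slug, "description": s.description}
            for s in CATALOG.values()]


def get_skill(slug: str) -> dict:
    """Tier 2: one skill's body plus a manifest of its files, no contents."""
    skill = CATALOG[slug]
    manifest = [{"path": f.path, "purpose": f.purpose} for f in skill.files]
    return {"slug": skill.slug, "body": skill.body, "files": manifest}


def get_skill_file(slug: str, path: str) -> str:
    """Tier 3: raw contents of exactly one file."""
    for f in CATALOG[slug].files:
        if f.path == path:
            return f.content
    raise KeyError(path)
```

An agent would walk these in order: scan the index, open one skill, then read a single file only if the body points it there.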

For files it is two tiers, because a file is its own unit and needs no manifest in between. list_files is the lean index: id, name, kind, content type, and size, no URLs and no contents. get_file pulls one file on demand, returning the text inline for a textual file, capped so a large file cannot blow the context budget, or a durable URL for a binary. The pattern is identical to skills, just collapsed by one tier because the shape of the data allows it.
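A sketch of the two file tiers, under stated assumptions: the cap value, the record layout, the `truncated` flag, and the example URLs are all hypothetical. Only the tool names and the "text inline, capped, or URL for a binary" behavior come from the description above.

```python
# Hypothetical cap on inline text; the post does not state the real limit.
MAX_INLINE_CHARS = 200

# Hypothetical store: one textual file and one binary file.
STORE = {
    "f_1": {"id": "f_1", "name": "design.md", "kind": "upload",
            "content_type": "text/markdown", "size": 500,
            "text": "x" * 500, "url": "https://example.com/files/f_1"},
    "f_2": {"id": "f_2", "name": "logo.png", "kind": "generation",
            "content_type": "image/png", "size": 48213,
            "text": None, "url": "https://example.com/files/f_2"},
}

INDEX_FIELDS = ("id", "name", "kind", "content_type", "size")


def list_files() -> list:
    """Tier 1: lean index with no URLs and no contents."""
    return [{k: f[k] for k in INDEX_FIELDS} for f in STORE.values()]


def get_file(file_id: str) -> dict:
    """Tier 2: capped inline text for textual files, a URL for binaries."""
    f = STORE[file_id]
    if f["text"] is not None:
        text = f["text"]
        return {"id": f["id"],
                "text": text[:MAX_INLINE_CHARS],
                "truncated": len(text) > MAX_INLINE_CHARS}
    return {"id": f["id"], "url": f["url"]}
```

The cap is what keeps a single get_file call from costing more than the agent expected when it picked the file from the index.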

Memory bends the rule deliberately, and the exception is worth stating because it clarifies the rule. There is no get_memory item tier; list_memories and search_memories return the full note body inline. That is intentional. Notes are small recall items, and the entire point of memory is one-call recall. Forcing a second fetch to read a note you already found would be disclosure theater, cost without benefit. The discipline is not "always add tiers." It is "pay context in proportion to what the task needs," and for a short note the proportional cost is the whole note.
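To make the exception concrete, here is a minimal sketch of a memory store where search returns full note bodies in one call. The substring match and the record fields are assumptions for illustration; the post does not say how search_memories ranks or matches notes.

```python
# Hypothetical in-memory note store shared by the three memory tools.
MEMORIES = []


def save_memory(owner_id: str, text: str) -> dict:
    """Persist a short note for one owner."""
    note = {"id": len(MEMORIES) + 1, "owner_id": owner_id, "text": text}
    MEMORIES.append(note)
    return note


def list_memories(owner_id: str) -> list:
    """Return every note for the owner, bodies included."""
    return [m for m in MEMORIES if m["owner_id"] == owner_id]


def search_memories(owner_id: str, query: str) -> list:
    """Return matching notes with full bodies; no second fetch is needed."""
    needle = query.lower()
    return [m for m in list_memories(owner_id) if needle in m["text"].lower()]
```

There is deliberately no `get_memory` here: whatever search finds is already readable.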

The anti-pattern this avoids is the flat server: fifty tools whose full schemas load before the agent has decided anything, or a single tool that dumps every file and every skill body in one response. Either one hands the model tens of thousands of tokens describing things the current task will never touch. A small index in front of on-demand fetches gives the same reach at a fraction of the standing cost.
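A back-of-the-envelope comparison shows the size of the gap. Every number below is an illustrative assumption, not a measurement from the platform: a hundred skills, about 20 tokens per index row, about 800 tokens per skill body, and about 600 tokens for one reference file.

```python
# Illustrative assumptions only; none of these figures come from the platform.
SKILL_COUNT = 100
TOKENS_PER_INDEX_ROW = 20
TOKENS_PER_SKILL_BODY = 800
TOKENS_PER_REFERENCE_FILE = 600


def flat_cost() -> int:
    """One tool dumps every skill body up front."""
    return SKILL_COUNT * TOKENS_PER_SKILL_BODY


def tiered_cost(skills_opened: int = 1, files_opened: int = 1) -> int:
    """Index first, then pay only for the bodies and files actually opened."""
    index = SKILL_COUNT * TOKENS_PER_INDEX_ROW
    bodies = skills_opened * TOKENS_PER_SKILL_BODY
    files = files_opened * TOKENS_PER_REFERENCE_FILE
    return index + bodies + files


flat = flat_cost()      # 100 * 800 = 80000
tiered = tiered_cost()  # 2000 + 800 + 600 = 3400
```

Under these assumptions the tiered path costs a small fraction of the flat dump for a task that touches one skill and one file, and the gap widens as the catalog grows.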

The same tools, three front doors

Here is the part that turns a tidy API into a coordination substrate. The tools on /api/mcp are not a special MCP-only surface. They are the same capabilities the platform exposes everywhere, reached three ways.

An external agent reaches them over MCP. Point Claude, Cursor, or any MCP client at the endpoint with a key, and list_skills, get_file, and the rest are callable tools the model can choose.

The in-product chat reaches the same capabilities from the inside. When a member talks to the assistant in the dashboard, the model is calling the same underlying functions, routed through the AI SDK's tool-calling machinery. The chat is not a separate implementation of image generation or memory; it is another caller of the one that already exists.

And a script reaches them over plain HTTP. The REST API and the MCP endpoint are two projections of the same credit-metered capabilities, so a CLI or a cron job hits the same functions with the same key that an agent uses interactively.

One capability, three front doors. That is what makes the architecture worth calling a reference architecture rather than a collection of endpoints. A file your agent generates over MCP at 2am is in the gallery your chat can reference at 9am and the CLI can download at noon, because there was only ever one file and one place it lived. The surfaces differ; the substrate does not.
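The "one capability, three callers" shape can be sketched as a single function registry with thin adapters in front of it. The adapter names, the REST path, and the gallery layout are hypothetical; the only detail borrowed from MCP itself is that a tool result is wrapped in a `content` list of text items.

```python
import json

# Hypothetical shared store: owner_id -> list of generated assets.
GALLERY = {}


def generate_image(owner_id: str, prompt: str) -> dict:
    """The one implementation; every front door ends up here."""
    assets = GALLERY.setdefault(owner_id, [])
    record = {"id": f"img_{len(assets) + 1}", "prompt": prompt}
    assets.append(record)
    return record


def list_assets(owner_id: str) -> list:
    return list(GALLERY.get(owner_id, []))


CAPABILITIES = {"generate_image": generate_image, "list_assets": list_assets}


def via_mcp(owner_id: str, tool_name: str, arguments: dict) -> dict:
    """MCP door: wrap the result as tool-call content."""
    result = CAPABILITIES[tool_name](owner_id, **arguments)
    return {"content": [{"type": "text", "text": json.dumps(result)}]}


def via_chat(owner_id: str, tool_name: str, arguments: dict):
    """In-product chat door: the model's tool call hits the same function."""
    return CAPABILITIES[tool_name](owner_id, **arguments)


def via_rest(owner_id: str, path: str, payload: dict):
    """REST door: a hypothetical route whose last segment names the capability."""
    tool_name = path.rsplit("/", 1)[-1]
    return CAPABILITIES[tool_name](owner_id, **payload)
```

Because the adapters hold no state of their own, an image generated through `via_mcp` shows up in `via_chat` and `via_rest` listings without any sync step.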

Auth and credits are what make it shared and safe

None of this works as a coordination layer without the two scoping decisions underneath it, and they are almost boring, which is the point.

Every tool call on the MCP transport resolves its owner from the API key. There is no session to manage, because the key is the identity. That resolved owner id scopes everything: list_files returns your files, get_skill includes your authored skills, search_memories searches your notes. One member's agents cannot reach another member's private context, and they do not have to be told not to; the scoping is structural, applied once at the transport boundary rather than re-checked in every tool.
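A minimal sketch of scoping applied once at the boundary: the key table, the owner ids, and the dispatcher name are invented for illustration. The point it demonstrates is that tools receive an already-resolved owner id and never see the key or do their own access checks.

```python
# Hypothetical key table and file records.
KEYS = {"dd_live_alice": "owner_alice", "dd_live_bob": "owner_bob"}
FILES = [
    {"owner_id": "owner_alice", "name": "plan.md"},
    {"owner_id": "owner_bob", "name": "budget.csv"},
]


def resolve_owner(authorization_header: str) -> str:
    """Turn a Bearer header into an owner id, or refuse the call."""
    prefix = "Bearer "
    if not authorization_header.startswith(prefix):
        raise PermissionError("missing bearer token")
    key = authorization_header[len(prefix):]
    if key not in KEYS:
        raise PermissionError("unknown API key")
    return KEYS[key]


def list_files(owner_id: str) -> list:
    """A tool only ever queries within the owner it was handed."""
    return [{"name": f["name"]} for f in FILES if f["owner_id"] == owner_id]


TOOLS = {"list_files": list_files}


def handle_tool_call(authorization_header: str, tool_name: str,
                     arguments: dict):
    """Transport boundary: resolve the owner once, then dispatch."""
    owner_id = resolve_owner(authorization_header)
    return TOOLS[tool_name](owner_id, **arguments)
```

Since the caller cannot pass an owner id as a tool argument, there is no parameter an agent could set to reach another member's files.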

Credits are the other half. A single universal balance meters the paid actions, and because the key maps to a stable owner id, that balance is the same whether the spend comes from the MCP endpoint, the in-product chat, or a script. Buy credits once, spend them from any front door. The free tools, everything in files, memory, and the library, cost nothing, because their cost is storage and lookups, not inference. The metered tools charge from one source of truth so the price shown and the price charged cannot drift.
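One way to picture "one source of truth" for pricing is a single price table that both the quote and the charge read from. The prices, the starting balance, and the function names below are made up for illustration; only the split between metered generation tools and free tools comes from the post.

```python
# Hypothetical price table: anything not listed is free.
PRICES = {"generate_image": 5, "generate_voice": 3}

# Hypothetical balances keyed by the owner id the API key resolves to.
BALANCES = {"owner_alice": 10}


class InsufficientCredits(Exception):
    pass


def quote(tool_name: str) -> int:
    """Price shown to the caller; free tools quote zero."""
    return PRICES.get(tool_name, 0)


def charge(owner_id: str, tool_name: str) -> int:
    """Price charged; it reads the same table as quote, so the two cannot drift."""
    cost = quote(tool_name)
    balance = BALANCES.get(owner_id, 0)
    if balance < cost:
        raise InsufficientCredits(f"{tool_name} costs {cost}, balance is {balance}")
    BALANCES[owner_id] = balance - cost
    return cost
```

Because the balance is keyed by owner id rather than by surface, a charge made from the MCP endpoint, the chat, or a script would draw down the same number.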

Put those two together and you have the quiet precondition for a fleet: a shared context substrate that is scoped per owner and billed once, reachable identically from every surface an agent might live on.

Why this is the shape for coordinating agents

The reason I keep returning to this design is that coordinating a fleet of agents is, in practice, a context problem before it is an orchestration problem. Agents do not fail to cooperate because they lack a message bus. They fail because each one holds a slightly different, slightly stale picture of the world, copied onto its disk at a different moment.

A single endpoint with tiered disclosure fixes that at the root. The runbook is a skill, one row in an index until an agent needs it, updated in one place so the whole fleet has the fix on its next get_skill call. The design doc your teammate uploaded is a file any agent can list and pull. The decision one agent recorded is a memory another agent can search. Nobody re-pastes, nobody re-syncs, and nothing drifts, because there is one library and every agent discovers it the same way. When we ran a fleet of agents for a day to rebuild this site, the thing that held the day together was exactly this: shared, verifiable context every agent could reach on the same terms.

That is the whole architecture. Two open standards each solved one half of the problem, and the combination, applied consistently across skills, files, memory, and generation on one endpoint, is the interesting part. You can browse the catalog by hand at /library, read the endpoint reference in the developer docs, and point your own agents at it today. The next post carries the same architecture to member-authored roles in Agent Studio.

FAQ

What is the difference between the MCP endpoint and the REST API?

They are two projections of the same credit-metered capabilities. The REST API is for scripts and servers calling over plain HTTP; the MCP endpoint exposes the same underlying functions as model-callable tools for an agent. Both authenticate with the same dd_live_ key and draw down the same credit balance, so the choice is about which client is calling, not which features are available.

Why put files and memory behind progressive disclosure instead of just returning everything?

Because returning everything spends context on data the current task will never read. A lean index (list_files, list_skills) plus an on-demand fetch (get_file, get_skill_file) lets an agent hold a large working set cheaply and pay full cost only for the one item it opens. The exception is memory notes, which are small enough that returning the body inline is the intended behavior rather than a leak.

How is one member's context kept separate from another's?

Every tool call resolves its owner from the API key at the transport boundary, and that owner id scopes every per-user tool. A caller only ever sees their own files, skills, and memories. Public content, like another member's explicitly public skill, is the documented exception, and it is opt-in.

Can I use this from a harness other than Claude Code?

Yes. MCP is a client-neutral protocol, so any compliant client discovers and calls the tools the same way. The skills themselves are plain SKILL.md markdown, an open format, so nothing about the pattern is tied to one harness.

How do I try it?

Create a dd_live_ API key, point an MCP client at the /api/mcp endpoint with it as a Bearer token, and call the tools. You can also browse the same skill and file catalog by hand at /library, and the full tool reference lives in the docs.
