MCP Tool Search - Claude Code
Deferred tool loading reduces context overhead for large MCP suites.
MCP tool search solves the "my MCP server has 80 tools" problem. Tools are loaded on demand instead of all at once.
What it does
When a server has tool search enabled, Claude Code sees a searchable index instead of having every tool's schema loaded into context. The model queries the index when it needs a tool, loads the matching tool's schema, and calls it. You keep the expressive power of large tool suites without paying tokens for every tool on every turn.
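The search, load, call flow described above can be sketched as a toy model. This illustrates the deferred-loading idea only; it is not Claude Code's implementation or the MCP wire protocol. The `ToolIndex` class, its methods, the keyword-overlap scoring, and the two sample tools are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    name: str
    description: str
    schema: dict  # full input schema: the expensive part to keep in context


@dataclass
class ToolIndex:
    """Hypothetical index: only names and descriptions are searchable up front."""
    tools: dict = field(default_factory=dict)
    loaded: dict = field(default_factory=dict)  # schemas pulled into context so far

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

    def search(self, query: str, limit: int = 3) -> list:
        """Rank tools by how many query words appear in the name or description."""
        words = set(query.lower().split())
        scored = []
        for tool in self.tools.values():
            text = tool.name.replace("_", " ") + " " + tool.description
            score = len(words & set(text.lower().split()))
            if score > 0:
                scored.append((score, tool.name))
        # Highest score first; ties broken alphabetically for stable output.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [name for _, name in scored[:limit]]

    def load(self, name: str) -> dict:
        """Load one schema on demand; later calls reuse the loaded copy."""
        if name not in self.loaded:
            self.loaded[name] = self.tools[name].schema
        return self.loaded[name]


index = ToolIndex()
index.register(Tool("create_issue", "Create a new issue in a tracker",
                    {"type": "object", "properties": {"title": {"type": "string"}}}))
index.register(Tool("list_invoices", "List invoices for a customer",
                    {"type": "object", "properties": {"customer_id": {"type": "string"}}}))

matches = index.search("create an issue")  # step 1: query the index
schema = index.load(matches[0])            # step 2: load only the matching schema
print(matches, len(index.loaded))
```

In this sketch only one of the two schemas ever enters `loaded`, which is the point of deferring: the unused tool stays in the index and costs nothing beyond its name and description.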
When to use it
- Any MCP server exposing more than a handful of tools.
- Multi-server setups where combined tool count bloats context.
- Cost-sensitive workflows where tool schemas are eating the token budget.
- Large internal platforms with hundreds of operations.
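To judge whether a setup falls into one of these cases, a back-of-envelope comparison can help. The model below is a rough sketch: the function names are hypothetical, and every figure (80 tools, 250 tokens per schema, 500 tokens of fixed overhead for the search mechanism, 3 tools actually used) is an illustrative assumption, not a measurement of Claude Code.

```python
def upfront_tokens(tool_count: int, avg_schema_tokens: int) -> int:
    """Context cost when every tool schema is loaded at once."""
    return tool_count * avg_schema_tokens


def deferred_tokens(tools_used: int, avg_schema_tokens: int, index_tokens: int) -> int:
    """Context cost when only a fixed overhead plus the schemas actually used are loaded."""
    return index_tokens + tools_used * avg_schema_tokens


# All figures below are assumed for illustration only.
eager = upfront_tokens(tool_count=80, avg_schema_tokens=250)
lazy = deferred_tokens(tools_used=3, avg_schema_tokens=250, index_tokens=500)
print(eager, lazy)
```

Plugging in your own tool count and typical schema size shows quickly whether the difference is large enough to matter for your workflow.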
Gotchas
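- Tool search adds a small amount of latency to the first call to each tool, because the tool has to be found and its schema loaded before it can be called.
- Tool selection depends on how well search queries match tool descriptions, so servers should give each tool a clear, specific description.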
- Not every MCP server supports deferred loading yet - check the server docs.
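The point about descriptions can be illustrated with a deliberately naive matcher. This is a sketch under stated assumptions: the `overlap_score` function and both descriptions are hypothetical, and the search that Claude Code actually runs may rank tools quite differently.

```python
def overlap_score(query: str, description: str) -> int:
    """Count query words that also appear in the description (naive keyword overlap)."""
    return len(set(query.lower().split()) & set(description.lower().split()))


query = "refund a customer payment"
vague = "Handles billing stuff"                      # hypothetical weak description
specific = "Refund a customer payment by charge ID"  # hypothetical strong description

print(overlap_score(query, vague), overlap_score(query, specific))
```

Whatever the real ranking method is, a description that names the action and its object gives a search query something to match against, while a vague one does not.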
Official docs: https://code.claude.com/docs/en/mcp.md#scale-with-mcp-tool-search




