TL;DR
Zilliz's claude-context MCP server lets Claude Code search millions of lines semantically instead of loading entire directories - cutting token usage by roughly 40%.
zilliztech/claude-context crossed 10,000 stars this week, picking up more than 3,700 in seven days. That velocity reflects a real pain point: the bigger your codebase, the harder it is for an AI coding agent to find the right code without you manually pasting files into context. The repo hit trending at a moment when teams are scaling Claude Code from small scripts to production monorepos, and the naive approach - --context-dir on a directory with hundreds of files - gets expensive fast. A hybrid semantic search layer on top of a vector database is one of the cleaner architectural answers to that problem, and this project wraps it in a single MCP server.
claude-context is a Model Context Protocol server built by Zilliz (the company behind the Milvus vector database). It indexes your codebase into a vector store, then exposes four tools to any MCP-compatible client.
The hybrid search combines BM25 (keyword matching) and dense vector (semantic) retrieval. That pairing matters: pure vector search misses exact function names; pure keyword search misses semantically related patterns. Combining them gives you better recall across both.
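To make the "combine both" idea concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a keyword ranking with a semantic ranking. It is an illustration under assumptions: the article does not say which fusion method claude-context uses, and the chunk IDs and the constant k = 60 are hypothetical.

```typescript
// Reciprocal rank fusion (RRF): merge several ranked lists of chunk IDs.
// Assumption: RRF is used here for illustration only; the article does not
// state how claude-context fuses BM25 and dense results.
type Ranked = string[]; // chunk IDs, best match first

function reciprocalRankFusion(lists: Ranked[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      // index is 0-based, so rank = index + 1; earlier ranks contribute more.
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Sort by fused score, highest first, and return only the IDs.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical results for the query "validate login token".
const keywordHits: Ranked = ["auth/validateToken", "auth/login", "utils/parseJwt"];
const semanticHits: Ranked = ["auth/sessionGuard", "auth/validateToken", "auth/login"];

const fused = reciprocalRankFusion([keywordHits, semanticHits]);
console.log(fused);
```

A chunk that appears in both lists accumulates score from each, so it outranks a chunk that only one retriever found. That is the recall benefit described above.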
Under the hood, the project uses AST-based chunking rather than naive line splitting. It parses the source into language-aware chunks, which means a class definition stays together rather than getting split across embedding boundaries. Incremental indexing uses Merkle trees to re-index only the files that changed since the last run - useful on active codebases where a full re-index on every session would be impractical.
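The incremental-indexing idea can be sketched with a deliberately simplified, two-level Merkle tree: one hash per file, plus a root hash over all of them. This is not the project's implementation; the Snapshot type, buildSnapshot, and diffSnapshots are hypothetical names, and file contents are passed in memory so the example stays self-contained.

```typescript
import { createHash } from "node:crypto";

// Assumption: a two-level tree (file hashes under one root hash).
// The real project's tree layout and hashing details may differ.
type Snapshot = { root: string; files: Map<string, string> };

const sha256 = (text: string): string =>
  createHash("sha256").update(text).digest("hex");

// Build a snapshot from an in-memory map of path -> file contents.
function buildSnapshot(contents: Map<string, string>): Snapshot {
  const files = new Map<string, string>();
  const paths = Array.from(contents.keys()).sort();
  for (const path of paths) {
    files.set(path, sha256(contents.get(path) ?? ""));
  }
  // The root hash covers every path and its hash, in sorted order.
  const root = sha256(paths.map((p) => `${p}:${files.get(p)}`).join("\n"));
  return { root, files };
}

// Return the paths that need re-indexing and the paths to drop from the index.
function diffSnapshots(
  previous: Snapshot,
  current: Snapshot
): { changed: string[]; removed: string[] } {
  const changed: string[] = [];
  const removed: string[] = [];
  // Fast path: identical roots mean nothing changed anywhere.
  if (previous.root === current.root) return { changed, removed };
  current.files.forEach((hash, path) => {
    if (previous.files.get(path) !== hash) changed.push(path);
  });
  previous.files.forEach((_hash, path) => {
    if (!current.files.has(path)) removed.push(path);
  });
  return { changed, removed };
}

// Hypothetical before/after states of a tiny project.
const before = buildSnapshot(new Map([
  ["src/a.ts", "export const a = 1;"],
  ["src/b.ts", "export const b = 2;"],
]));
const after = buildSnapshot(new Map([
  ["src/a.ts", "export const a = 1;"],
  ["src/b.ts", "export const b = 3;"],
  ["src/c.ts", "export const c = 4;"],
]));

const delta = diffSnapshots(before, after);
console.log(delta);
```

Comparing root hashes first is what makes the no-change case cheap. Only when the roots differ does the sketch walk the per-file hashes, and only the changed or new files would be re-embedded.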
Supported languages include TypeScript, JavaScript, Python, Java, C++, C#, Go, Rust, PHP, Ruby, Swift, Kotlin, Scala, and Markdown. The package is available as @zilliz/claude-context-mcp on npm and runs through npx, so there is no global install required.
The project claims roughly 40% token reduction compared to loading entire directories. That figure comes from the maintainers' own evaluation, so treat it as directionally useful rather than a precise benchmark.
The quickest path is a single claude mcp add command inside your project directory:
claude mcp add claude-context \
  -e OPENAI_API_KEY=sk-your-openai-api-key \
  -e MILVUS_ADDRESS=your-zilliz-cloud-public-endpoint \
  -e MILVUS_TOKEN=your-zilliz-cloud-api-key \
  -- npx @zilliz/claude-context-mcp@latest
You will need a Zilliz Cloud account (free tier is available) for MILVUS_ADDRESS and MILVUS_TOKEN, and an OpenAI API key for the default embedding model (text-embedding-3-small).
Once configured, open Claude Code in your project and type:
Index this codebase
Claude will call the index_codebase tool. When indexing completes, you can search:
Find functions that handle user authentication
Claude retrieves the relevant snippets from the vector index rather than reading every file. If you prefer Cursor, the same server works via ~/.cursor/mcp.json. The project also documents setups for Claude Desktop, Codex CLI, Gemini CLI, VS Code, Windsurf, and about a dozen others.
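For the Cursor route, a small script can generate the entry for ~/.cursor/mcp.json. This is a sketch under assumptions: it mirrors the command and environment variables from the claude mcp add example above, and buildCursorConfig is a hypothetical helper, so check the project's README for the exact block it recommends.

```typescript
// Builds an MCP server entry for ~/.cursor/mcp.json.
// Assumption: the entry mirrors the `claude mcp add` command shown earlier;
// the project's README may recommend additional arguments.
type McpServer = {
  command: string;
  args: string[];
  env: Record<string, string>;
};

type McpConfig = { mcpServers: Record<string, McpServer> };

function buildCursorConfig(env: Record<string, string>): McpConfig {
  return {
    mcpServers: {
      "claude-context": {
        command: "npx",
        args: ["@zilliz/claude-context-mcp@latest"],
        env,
      },
    },
  };
}

// Placeholder credentials, copied from the example command above.
const config = buildCursorConfig({
  OPENAI_API_KEY: "sk-your-openai-api-key",
  MILVUS_ADDRESS: "your-zilliz-cloud-public-endpoint",
  MILVUS_TOKEN: "your-zilliz-cloud-api-key",
});

// Print the JSON to paste into (or merge with) ~/.cursor/mcp.json.
console.log(JSON.stringify(config, null, 2));
```

If the file already lists other servers, merge the claude-context entry into the existing mcpServers object rather than overwriting the file.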
For programmatic use, the core package is available separately as @zilliz/claude-context-core with a clean API for indexing and searching from Node.js or TypeScript scripts.
This tool makes the most sense for developers working on codebases large enough that context loading becomes a bottleneck. If you are running Claude Code on a project with tens of thousands of lines across many files - a backend monorepo, a large library, a legacy codebase you are modernizing - the ability to search semantically rather than load blindly is a genuine workflow improvement.
It also suits teams with strict API cost controls. Loading a 500-file directory into every Claude Code session burns tokens quickly. An indexed search that retrieves only 5-10 relevant snippets per query is a more economical pattern for sustained use.
Developers who have adopted MCP-first workflows will find this a natural fit. If you already manage a ~/.claude/mcp.json or use claude mcp add for other tools, adding one more server has near-zero friction.
If you are on a small codebase (a few thousand lines), the overhead of setting up Zilliz Cloud and waiting for the initial indexing is probably not worth it. Plain --context-dir or targeted file reads will serve you fine.
MCP servers are a core thread in the Developer's Digest coverage. The MCP Config Generator at mcp.developersdigest.tech helps you assemble valid MCP JSON configurations for Claude Code and Cursor, and claude-context slots in cleanly as an entry there - you paste your Zilliz and OpenAI credentials, and the generator produces the right JSON block.
The broader story here also connects to the skills directory at skills.developersdigest.tech. Context retrieval and agent skills are complementary: skills tell Claude Code how to approach a task, while an MCP server like claude-context controls what code Claude sees when doing it. Pairing a well-scoped skill - say, a TDD skill that drives a red-green-refactor loop - with a semantic search layer that retrieves the right tests and implementations is closer to how productive AI-native workflows actually look in 2026.
If you are building your own MCP server or extending claude-context for a specific use case, the hooks infrastructure covered at hooks.developersdigest.tech gives you the automation layer to trigger re-indexing on file saves or post-commit, keeping the search index fresh without manual intervention.
The strengths are real. AST-aware chunking is a better engineering choice than naive splitting. Hybrid BM25 plus dense retrieval is the right algorithm for code search. Incremental indexing with Merkle trees shows the team thought about day-two usage, not just the demo. The npm package installs without a global binary, and the four-tool API surface is small enough to understand quickly.
The limitations are also real. You must maintain an external vector database. Zilliz Cloud has a free tier, but it is still an external dependency with its own reliability surface. The default embedding model is OpenAI's text-embedding-3-small, which adds another API key and cost bucket to your setup. The 40% token reduction claim is self-reported and likely varies significantly by codebase structure and query patterns.
It is also worth noting that this is a first-party product from Zilliz, the company that makes the underlying database. That alignment of incentives is not a disqualifier, but it is context: the project is both a useful tool and a distribution channel for Zilliz Cloud sign-ups.
For teams already invested in the Milvus ecosystem or already paying for Zilliz Cloud, this is a low-friction addition. For everyone else, the external dependencies are a genuine evaluation criterion.