TL;DR
Zilliz's claude-context MCP lets Claude Code search your entire codebase semantically without loading every file - reaching 9,500 stars with a 3,300-star week.
zilliztech/claude-context crossed 9,500 GitHub stars this week, picking up roughly 3,300 new stars in seven days. That velocity is a clear signal: developers working with large codebases and AI coding assistants have hit a real wall, and this project addresses it directly.
The problem is straightforward. When you ask Claude Code to help you in a large monorepo - one with hundreds of thousands of lines of code across dozens of services - the agent has to decide what to load into context. Load too little and it misses relevant code. Load entire directories and you burn through tokens fast, slow down responses, and risk hitting context limits. Neither option is great.
Claude Context takes a different path. Instead of choosing between "load nothing" and "load everything," it indexes your codebase into a vector database and retrieves only the sections that are semantically relevant to each query. According to the project's own evaluation, this approach achieves approximately 40% token reduction compared to full-directory loading while maintaining equivalent retrieval quality. That is a meaningful number for teams running agents against large repos every day.
Claude Context is a Model Context Protocol (MCP) server built by Zilliz, the company behind the Milvus vector database. It implements what the team calls "hybrid code search" - a combination of BM25 keyword matching and dense vector similarity search.
BM25 is the classical ranking function used in full-text search engines like Elasticsearch. Dense vector search finds code that is semantically similar even when the exact keywords differ. Combining both gives you retrieval that is accurate for both literal lookups and conceptual queries.
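How the two result lists get merged is not covered here, so treat the following as a generic illustration rather than claude-context's actual code. One common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The chunk ids below are hypothetical, and the constant k=60 is a conventional default, not a project setting.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into a single ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is stable.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical results for the query "handle user authentication".
bm25_hits = ["auth/login.py::login", "auth/token.py::verify", "docs/auth.md"]
dense_hits = ["auth/session.py::require_user", "auth/login.py::login", "auth/token.py::verify"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
for chunk_id in fused:
    print(chunk_id)
```

Chunks that appear in both lists outrank chunks that only one retriever found, which is the behavior you want when a query mixes literal identifiers with conceptual wording.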
The indexing pipeline does more than chunk files by line count. It parses source code using Abstract Syntax Trees (AST) before splitting, which means function boundaries, class definitions, and logical blocks stay intact. When Claude queries the index with "find functions that handle user authentication," it gets back whole, coherent chunks of code - not fragments cut in the middle of a method.
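To make the idea concrete, here is a minimal sketch of AST-aware chunking using Python's built-in ast module on a Python file. It is a stand-in for illustration only: the project's own parser and chunk format are not shown in this article, and the sample source and the dictionary fields are hypothetical.

```python
import ast


def chunk_python_source(source):
    """Return one chunk per top-level function or class, with its line span."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno and end_lineno are 1-based and inclusive (Python 3.8+).
            text = "\n".join(lines[node.lineno - 1:node.end_lineno])
            chunks.append({
                "name": node.name,
                "start": node.lineno,
                "end": node.end_lineno,
                "text": text,
            })
    return chunks


SAMPLE = '''\
import hashlib

def hash_password(password, salt):
    data = (salt + password).encode()
    return hashlib.sha256(data).hexdigest()

class Authenticator:
    def __init__(self, users):
        self.users = users

    def check(self, name, password, salt):
        return self.users.get(name) == hash_password(password, salt)
'''

chunks = chunk_python_source(SAMPLE)
for chunk in chunks:
    print(chunk["name"], chunk["start"], chunk["end"])
```

Every chunk starts and ends on a definition boundary, so a retrieved result is always a complete function or class. A fixed line-count splitter offers no such guarantee and could cut the Authenticator class in half.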
The server exposes four tools to Claude:
index_codebase - crawl and index a directory
search_code - run a natural language query against the index
get_indexing_status - check indexing progress
clear_index - remove the index for a specific codebase
Indexing is incremental. The project uses Merkle trees to track file changes, so re-indexing after edits only processes files that have actually changed. For large codebases this makes a material difference.
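The change-detection idea can be sketched in a few lines. This is a simplified illustration of Merkle-style hashing over an in-memory directory tree, not the project's implementation: the nested-dict layout, the function names, and the sample files are all assumptions made for the example.

```python
import hashlib


def sha(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_tree(node):
    """Hash a tree where a str is file content and a dict is a directory."""
    if isinstance(node, str):
        return {"hash": sha(node.encode()), "children": None}
    children = {name: build_tree(child) for name, child in sorted(node.items())}
    # A directory's hash depends on the names and hashes of its children.
    combined = "".join(f"{name}:{child['hash']};" for name, child in children.items())
    return {"hash": sha(combined.encode()), "children": children}


def changed_files(old, new, prefix=""):
    """Paths of files added or modified in `new` (deletions are not reported)."""
    if old is not None and old["hash"] == new["hash"]:
        return []  # identical subtree: skip it without visiting any files
    if new["children"] is None:
        return [prefix]
    old_children = (old or {}).get("children") or {}
    paths = []
    for name, child in new["children"].items():
        path = f"{prefix}/{name}" if prefix else name
        paths += changed_files(old_children.get(name), child, path)
    return paths


before = {"README.md": "# demo",
          "src": {"auth.py": "def login(): ...", "db.py": "def connect(): ..."}}
after = {"README.md": "# demo",
         "src": {"auth.py": "def login(user): ...", "db.py": "def connect(): ..."}}

to_reindex = changed_files(build_tree(before), build_tree(after))
print(to_reindex)
```

Because a directory hash changes only when something beneath it changes, whole unchanged subtrees are skipped with a single comparison. That is what keeps re-indexing cheap on a large repo.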
Supported languages include TypeScript, JavaScript, Python, Java, C++, C#, Go, Rust, PHP, Ruby, Swift, Kotlin, Scala, and Markdown - covering most production codebases.
There are three hard prerequisites before anything else: a Node.js version from 20.0.0 up to, but not including, 24.0.0 (the project explicitly does not support 24.0.0 or newer), an OpenAI API key for generating embeddings, and a Zilliz Cloud account for the vector store.
Check your Node.js version first:
node --version
If you are on Node 24, downgrade via nvm before proceeding.
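If you want to automate that check in a setup script, a small helper like the one below works. It is a sketch based only on the version range stated above; the function name and constants are made up for this example, and it expects the plain string that node --version prints.

```python
import re

SUPPORTED_MIN_MAJOR = 20      # lowest supported major version, per the prerequisites
UNSUPPORTED_FROM_MAJOR = 24   # 24.0.0 and newer are not supported


def node_version_supported(version: str) -> bool:
    """Return True if a `node --version` string falls in the 20.x to 23.x range."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Node.js version string: {version!r}")
    major = int(match.group(1))
    return SUPPORTED_MIN_MAJOR <= major < UNSUPPORTED_FROM_MAJOR


for candidate in ["v20.0.0", "v22.11.0", "v24.0.0", "v18.19.1"]:
    print(candidate, node_version_supported(candidate))
```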
To add it as an MCP server in Claude Code:
claude mcp add claude-context \
-e OPENAI_API_KEY=sk-your-openai-api-key \
-e MILVUS_ADDRESS=your-zilliz-cloud-public-endpoint \
-e MILVUS_TOKEN=your-zilliz-cloud-api-key \
-- npx @zilliz/claude-context-mcp@latest
Once added, open Claude Code in your project directory and run:
> Index this codebase
> Check the indexing status
> Find functions that handle database connections
For Cursor, add this to ~/.cursor/mcp.json:
{
  "mcpServers": {
    "claude-context": {
      "command": "npx",
      "args": ["-y", "@zilliz/claude-context-mcp@latest"],
      "env": {
        "OPENAI_API_KEY": "your-openai-api-key",
        "MILVUS_ADDRESS": "your-zilliz-cloud-public-endpoint",
        "MILVUS_TOKEN": "your-zilliz-cloud-api-key"
      }
    }
  }
}
The same JSON structure works for Claude Desktop, Windsurf, VS Code, Cline, and Roo Code with minimal adjustments.
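If you already have other MCP servers configured, you will want to merge this entry in rather than overwrite the file. Here is a minimal sketch of that merge, assuming the mcpServers layout shown above. The helper name and the "other-server" entry are hypothetical, and other clients may keep their config in a different file or under a different top-level key.

```python
import json


def add_claude_context(config, openai_key, milvus_address, milvus_token):
    """Return a copy of an MCP config dict with a claude-context entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any existing servers
    servers["claude-context"] = {
        "command": "npx",
        "args": ["-y", "@zilliz/claude-context-mcp@latest"],
        "env": {
            "OPENAI_API_KEY": openai_key,
            "MILVUS_ADDRESS": milvus_address,
            "MILVUS_TOKEN": milvus_token,
        },
    }
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server already present.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_claude_context(
    existing,
    "your-openai-api-key",
    "your-zilliz-cloud-public-endpoint",
    "your-zilliz-cloud-api-key",
)
print(json.dumps(updated, indent=2))
```

The function returns a new dict and leaves the input untouched, so you can inspect the result before writing it back to disk.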
The sweet spot for Claude Context is teams or solo developers who regularly run AI agents against codebases with more than a few hundred files. If your project is small enough that loading relevant files manually is fast, you probably don't need this yet. But once you're working across services, dealing with unfamiliar legacy code, or running agents that need broad codebase knowledge without burning tokens - this fills a real gap.
Specifically:
Large monorepo teams - agents working across multiple services can query the shared index rather than having each session rediscover the same code.
Developers onboarding to unfamiliar codebases - natural language queries like "find where API rate limiting is implemented" are faster than grepping when you don't know the naming conventions.
Agent pipeline builders - if you're composing multi-step agents that need to fetch code as part of a reasoning loop, having a token-efficient retrieval layer matters at scale.
Open-source contributors - indexing a large repo before diving into a contribution saves the manual "where is X" exploration.
MCP servers are a growing part of how developers extend Claude Code, and Claude Context fits squarely into that pattern. If you're exploring what MCP servers are available and how to configure them, mcp.developersdigest.tech is a curated directory worth bookmarking - it covers servers across categories including code, databases, and productivity tooling.
Claude Context is also a concrete example of the broader pattern: vector databases moving from infrastructure that powers AI products to infrastructure that powers AI agents directly. Zilliz is using their own Milvus stack here, which means Claude Context doubles as a demonstration of what's possible when you wire a production-grade vector database into an MCP server.
For developers already using Claude Code hooks and skills to customize their AI workflows, adding an MCP server for semantic search is a natural next step - it complements the skills and hooks layers by improving the quality of retrieval rather than changing agent behavior directly.
The token reduction claim (40%) comes from the project's own benchmarks - independent verification would be useful before treating it as a universal number. Real-world gains will vary based on how you write queries, how your codebase is structured, and how much Claude would have loaded anyway.
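A quick back-of-the-envelope calculation shows why the exact percentage matters. The session size and session count below are invented for illustration, and only the 40% figure comes from the project's benchmark. The 10% and 25% rows are there to show what a weaker real-world result would look like.

```python
def estimated_tokens_saved(baseline_tokens: int, reduction_pct: int) -> int:
    """Tokens saved if usage drops by reduction_pct percent from the baseline."""
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must be between 0 and 100")
    return baseline_tokens * reduction_pct // 100


# Hypothetical workload: 50 agent sessions a week at 120,000 tokens each.
baseline = 120_000 * 50

for pct in (10, 25, 40):
    saved = estimated_tokens_saved(baseline, pct)
    print(f"{pct}% reduction saves {saved:,} of {baseline:,} tokens per week")
```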
The dependency on Zilliz Cloud is the main friction point. Milvus can be run locally, but the quick-start documentation pushes you toward their hosted product. Developers who prefer fully local setups will need to do extra configuration work. The Node.js version restriction (no 24.x) is also a real constraint if your environment is already on the latest LTS.
On the positive side: AST-based chunking is meaningfully better than naive line splitting for code retrieval. The incremental indexing via Merkle trees is a thoughtful engineering choice that shows the team has thought about day-to-day developer workflow, not just the demo case. And the hybrid BM25 + dense search is the right architecture for code search, where both exact keyword matches and semantic similarity matter.
If you're already paying for OpenAI embeddings and can spin up a Zilliz Cloud instance, this is a well-built tool worth adding to your Claude Code setup.