TL;DR
zilliztech/claude-context adds semantic code search to Claude Code via MCP, letting you query your entire codebase in natural language while cutting token usage by ~40%.
zilliztech/claude-context hit 8,100 stars on GitHub this week and added over 1,000 stars in a single day - making it one of the fastest-rising repositories in the Claude Code ecosystem right now. The surge is not hard to explain: it solves a real, daily frustration for anyone using Claude Code on a codebase larger than a handful of files.
The core problem is context limits. Claude Code is powerful, but feeding it whole directories burns tokens quickly and bogs down responses. Most teams work around this by carefully curating what they paste in, which defeats the purpose of having an AI assistant at all. Claude Context approaches the problem from a different angle - index the codebase once into a vector database, then let the agent retrieve only the relevant chunks on demand. The pitch is a 40% reduction in token usage with equivalent retrieval quality.
Claude Context is a Model Context Protocol (MCP) server built on top of Zilliz Cloud, the managed version of the open-source Milvus vector database. Once installed, it exposes four tools to any MCP-compatible agent:
index_codebase - scans a directory, splits code using AST-based chunking, embeds each chunk, and stores it in a vector collection
search_code - runs hybrid search (BM25 full-text plus dense vector embeddings) against the indexed codebase using a natural language query
get_indexing_status - returns progress percentage and completion status while indexing runs
clear_index - removes a stored index for a given codebase
The hybrid search is the meaningful technical detail here. Pure vector search struggles with exact identifier names - function names, variable names, file paths. BM25 handles those well but misses semantic intent. Combining both gives you the best of each approach, and it is the same strategy used by production enterprise search systems.
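To make the hybrid idea concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a keyword ranking with a vector ranking. This is an assumption for illustration: the article does not say which fusion method claude-context uses, and the chunk ids, the function name, and the constant k = 60 are all hypothetical choices.

```typescript
// Reciprocal rank fusion: each list votes 1 / (k + rank) for every item it contains.
// Illustrative only; not taken from the claude-context source.

type Ranking = string[]; // chunk ids, best match first

function reciprocalRankFusion(rankings: Ranking[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // index is 0-based, so the 1-based rank is index + 1
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Highest combined score first
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical results for a query like "where do we validate login tokens?"
const bm25Ranking: Ranking = ["auth/verifyToken", "auth/login", "utils/logger"];
const denseRanking: Ranking = ["auth/session", "auth/verifyToken", "auth/login"];

const fused = reciprocalRankFusion([bm25Ranking, denseRanking]);
console.log(fused);
```

Chunks that appear in both lists rise to the top, while a chunk found by only one retriever still makes the final list. That is the practical benefit of combining exact-match and semantic signals.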
Indexing uses Merkle trees for incremental updates. After the initial pass, re-indexing only touches files that changed - so you do not pay the full embedding cost every time you modify a file.
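The change-detection idea can be sketched roughly as follows. This is a simplified, two-level version (file hashes plus one root hash) written for illustration, not the project's actual implementation; a full Merkle tree would also hash each directory. All function names and file contents here are hypothetical.

```typescript
import { createHash } from "node:crypto";

type Snapshot = Map<string, string>; // relative path -> content hash

const sha256 = (text: string): string =>
  createHash("sha256").update(text).digest("hex");

// Leaf level: one hash per file.
function snapshotFiles(files: Record<string, string>): Snapshot {
  const snapshot: Snapshot = new Map();
  for (const [path, content] of Object.entries(files)) {
    snapshot.set(path, sha256(content));
  }
  return snapshot;
}

// Root level: a single hash that changes whenever any leaf changes.
function rootHash(snapshot: Snapshot): string {
  const leaves = Array.from(snapshot.entries())
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([path, hash]) => `${path}:${hash}`);
  return sha256(leaves.join("\n"));
}

// Compare two snapshots and report which files need work.
function diffSnapshots(previous: Snapshot, current: Snapshot) {
  const changed: string[] = [];
  const removed: string[] = [];
  current.forEach((hash, path) => {
    if (previous.get(path) !== hash) changed.push(path); // new or modified
  });
  previous.forEach((_hash, path) => {
    if (!current.has(path)) removed.push(path);
  });
  return { changed, removed };
}

// Hypothetical repository state before and after an edit.
const prevSnapshot = snapshotFiles({
  "src/a.ts": "export const a = 1;",
  "src/b.ts": "export const b = 2;",
});
const nextSnapshot = snapshotFiles({
  "src/a.ts": "export const a = 1;",
  "src/b.ts": "export const b = 3;",
  "src/c.ts": "export const c = 4;",
});

const rootsDiffer = rootHash(prevSnapshot) !== rootHash(nextSnapshot);
const delta = diffSnapshots(prevSnapshot, nextSnapshot);
console.log(rootsDiffer, delta);
```

When the root hashes match, nothing needs re-embedding and the comparison stops there. When they differ, only the paths in `changed` are re-embedded and the paths in `removed` are dropped from the index.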
Language support covers TypeScript, JavaScript, Python, Java, C++, C#, Go, Rust, PHP, Ruby, Swift, Kotlin, Scala, and Markdown. The AST chunker has a fallback to LangChain character-based splitting for languages it does not natively understand.
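The fallback path can be pictured with a small sketch. The splitter below is a simplified stand-in for a character-based splitter with overlap, not LangChain's actual implementation, and the chunker registry, function names, and default sizes are hypothetical.

```typescript
interface Chunk {
  start: number; // inclusive character offset
  end: number;   // exclusive character offset
  text: string;
}

// Fixed-size windows that overlap, so a statement cut at one boundary
// still appears whole in the neighbouring chunk.
function splitByCharacters(source: string, chunkSize = 1000, overlap = 200): Chunk[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("need chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: Chunk[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < source.length; start += step) {
    const end = Math.min(start + chunkSize, source.length);
    chunks.push({ start, end, text: source.slice(start, end) });
    if (end === source.length) break; // the last window reached the end of the file
  }
  return chunks;
}

type Chunker = (source: string) => Chunk[];

// Hypothetical registry: file extension -> AST-aware chunker.
// Left empty here, so every file takes the fallback path.
const astChunkers = new Map<string, Chunker>();

function chunkFile(path: string, source: string): Chunk[] {
  const extension = path.slice(path.lastIndexOf(".") + 1);
  const astChunker = astChunkers.get(extension);
  return astChunker ? astChunker(source) : splitByCharacters(source);
}

const demoChunks = splitByCharacters("abcdefghij", 4, 1);
console.log(demoChunks.map((chunk) => chunk.text));
```

The trade-off is that character windows ignore function and class boundaries, so retrieval quality on unsupported languages is likely to be lower than on languages the AST chunker understands.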
Beyond Claude Code, the same MCP server works with Cursor, VS Code, Windsurf, Gemini CLI, Cline, and a dozen other AI coding environments. The configuration format differs per platform but the underlying server is the same.
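Many MCP clients read a JSON file with an `mcpServers` map, so a non-Claude-Code setup often looks roughly like the object built below. Treat this as a sketch under assumptions: the exact file location and top-level key vary by client, and the `buildMcpConfig` helper is hypothetical. The package name and environment variable names are the ones used in this article's Claude Code command.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Build the entry that launches the claude-context server through npx.
function buildMcpConfig(env: Record<string, string>): McpConfig {
  return {
    mcpServers: {
      "claude-context": {
        command: "npx",
        args: ["@zilliz/claude-context-mcp@latest"],
        env,
      },
    },
  };
}

// Placeholder values; substitute your own keys and endpoint.
const mcpConfig = buildMcpConfig({
  OPENAI_API_KEY: "sk-your-openai-api-key",
  MILVUS_ADDRESS: "your-zilliz-cloud-public-endpoint",
  MILVUS_TOKEN: "your-zilliz-cloud-api-key",
});

// Paste the printed JSON into your client's MCP configuration file.
console.log(JSON.stringify(mcpConfig, null, 2));
```

Check your client's documentation for where this JSON belongs before relying on it.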
You need three things before you start: a free Zilliz Cloud account (for the vector database endpoint and API key), an OpenAI API key (for the text-embedding-3-small model used by default), and a Node.js version of at least 20.0.0 and below 24.0.0.
For Claude Code, a single command wires everything up:
claude mcp add claude-context \
-e OPENAI_API_KEY=sk-your-openai-api-key \
-e MILVUS_ADDRESS=your-zilliz-cloud-public-endpoint \
-e MILVUS_TOKEN=your-zilliz-cloud-api-key \
-- npx @zilliz/claude-context-mcp@latest
After that, open Claude Code in your project and run:
Index this codebase
Claude will call index_codebase on the current directory. On a large repo the first index takes a few minutes. You can check progress with:
Check the indexing status
Once indexing is done, you can query in plain English:
Find functions that handle user authentication
The MCP server returns ranked, contextually relevant code snippets back to Claude, which uses them to answer your question without loading unrelated files.
For teams who want programmatic control, the @zilliz/claude-context-core package exposes the same engine directly:
import { Context, MilvusVectorDatabase, OpenAIEmbedding } from '@zilliz/claude-context-core';

// Wire the embedding provider and the vector database into one Context.
const context = new Context({
  embedding: new OpenAIEmbedding({ apiKey: process.env.OPENAI_API_KEY }),
  vectorDatabase: new MilvusVectorDatabase({
    address: process.env.MILVUS_ADDRESS,
    token: process.env.MILVUS_TOKEN
  })
});

// Index the project once, then run a natural language query against it.
const stats = await context.indexCodebase('./your-project');
const results = await context.semanticSearch('./your-project', 'query', 5);
If you are a solo developer working on a project with more than 20,000 lines of code, Claude Context is worth trying. The pain of manually specifying context files adds up fast, and the 40% token reduction translates directly to lower API costs.
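As a rough worked example of what that reduction could mean in dollars, the sketch below applies a 40% cut to a monthly input-token bill. The token volume and the per-million-token price are hypothetical placeholders, not figures from the project or any provider, and the calculation ignores the added cost of embeddings and the vector database.

```typescript
// Cost of input tokens at a flat per-million-token price.
function monthlyInputCost(tokensPerMonth: number, usdPerMillionTokens: number): number {
  return (tokensPerMonth / 1_000_000) * usdPerMillionTokens;
}

const baselineTokens = 50_000_000; // hypothetical monthly input tokens
const usdPerMillion = 3;           // hypothetical price, USD per million input tokens
const reduction = 0.4;             // the 40% figure quoted for Claude Context

const costBefore = monthlyInputCost(baselineTokens, usdPerMillion);
const costAfter = monthlyInputCost(baselineTokens * (1 - reduction), usdPerMillion);
const monthlySaving = costBefore - costAfter;

console.log(costBefore.toFixed(2), costAfter.toFixed(2), monthlySaving.toFixed(2));
```

Swap in your own token volume and pricing to see whether the saving outweighs the cost of running the index.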
For teams using Claude Code collaboratively, the shared vector index means everyone on the team can do semantic search against the same indexed snapshot. That is a genuine workflow improvement over each engineer managing their own context files.
The repo is also a good fit for anyone already in the Milvus or Zilliz ecosystem. If you are storing embeddings for other purposes (RAG pipelines, semantic search for docs, etc.), you are already paying for the infrastructure - adding code search on top is low marginal cost.
Developers building MCP servers or AI agent integrations will find the source code instructive. The monorepo structure - core library, MCP server, VS Code extension - is a clean example of how to package a single semantic search capability for multiple consumers.
MCP is one of the most important emerging standards for AI-native tooling, and Claude Context is a good example of what a well-scoped MCP server looks like. It does one thing - semantic code search - and it does it well across every major AI coding environment.
If you use Claude Code and want to explore what else is available in the MCP ecosystem, the DevDigest MCP directory at mcp.developersdigest.tech catalogs tools across categories including code intelligence, data access, and developer productivity. Claude Context fits squarely in the code intelligence category.
The incremental indexing approach (Merkle trees, AST chunking, hybrid retrieval) also previews techniques you will see in more AI-native coding workflows going forward - not just for search but for memory, context management, and agent continuity. Following repos like this one is a good way to stay ahead of where the tooling is heading.
The setup has real friction. You need a Zilliz Cloud account and an OpenAI API key before you can run anything. For developers who are already paying for OpenAI embeddings and want a managed vector DB, the stack is familiar. For everyone else, standing up two new external services is a hurdle, and the project does not yet offer a simpler, documented local option to avoid it.
Ollama is listed as a supported embedding provider, which suggests a local-only path is possible, but the documentation focuses on the cloud setup. If you want to run everything locally without external API keys, you will need to dig into the configuration yourself.
With 75 open issues and 41 open pull requests against 165 total commits, the project is active but still maturing. Expect rough edges on less-common languages and file structures.
The 40% token reduction figure comes from a controlled evaluation described in the README but without a published benchmark methodology. Take the number as a directional signal rather than a guaranteed outcome for your specific codebase.
For teams with large codebases and active Claude Code workflows, the tradeoffs are favorable. For smaller projects or teams averse to new service dependencies, the manual context approach may still be simpler.