<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Developers Digest - Guides</title>
    <link>https://www.developersdigest.tech/guides</link>
    <description>Step-by-step guides for building AI-powered applications. Claude Code setup, MCP servers, and more.</description>
    <language>en</language>
    <lastBuildDate>Thu, 23 Apr 2026 14:11:07 GMT</lastBuildDate>
    <atom:link href="https://www.developersdigest.tech/guides/feed.xml" rel="self" type="application/rss+xml" />
    <image>
      <url>https://avatars.githubusercontent.com/u/124798203?v=4</url>
      <title>Developers Digest - Guides</title>
      <link>https://www.developersdigest.tech/guides</link>
    </image>
    <item>
      <title><![CDATA[Writing Your First Claude Code Skill]]></title>
      <link>https://www.developersdigest.tech/guides/writing-your-first-claude-code-skill</link>
      <guid isPermaLink="true">https://www.developersdigest.tech/guides/writing-your-first-claude-code-skill</guid>
      <description><![CDATA[A practical walk-through of how to design, write, and ship a Claude Code skill - from choosing when to trigger, through allowed-tools, to the steps the agent will actually follow.]]></description>
      <content:encoded><![CDATA[
## What a skill actually is

A Claude Code skill is a single markdown file that teaches Claude how to do one specific task the way you want it done. It lives at `~/.claude/skills/<name>/SKILL.md` and Claude auto-loads it when the trigger matches the current request.

Think of it as a playbook you hand to a good engineer who is about to start the task. The engineer is smart, so you do not need to teach them how to write code. You do need to tell them the shape of the task, the constraints that apply, the gotchas you have already discovered, and what "done" looks like. That is the skill.

By the end of this guide you will have written your first skill, installed it, triggered it from a real prompt, and understood the design decisions that make the difference between a skill that Claude actually uses and one that sits forgotten.

## The four parts of every skill

Every skill is a markdown file with YAML frontmatter. Four things matter:

1. **Name** - what the skill is called
2. **Description** - when Claude should trigger it
3. **Allowed tools** - which tools the skill expects Claude to have access to
4. **Body** - the actual instructions for doing the task

The frontmatter controls when the skill fires. The body controls what happens when it does. Getting both right is the whole craft.

## Pick a task worth turning into a skill

The first question is which tasks deserve a skill. The criteria:

- **You do it more than once.** A skill is a tax on every session's context load. Writing one for a task you do annually is not worth it. A task you do weekly absolutely is.
- **It has a consistent shape.** Skills encode patterns. If every instance of the task is wildly different, there is no pattern to encode.
- **You have an opinion about how it should be done.** Without an opinion, the skill is just a nudge toward "do a good job," which is not a skill. With an opinion, the skill becomes a style enforcer.
- **It has failure modes you want to prevent.** Skills are where you capture the lessons learned from mistakes. If a task has surprised you in the past, write the skill so future-you does not step in the same hole.

A bad first skill: "write code." Too vague. No opinion. No failure modes.

A good first skill: "add a new HTTP route to our Express API." Specific shape, conventions worth encoding (route handler file, input validation, error pattern, test file), known failure modes.

## Write the frontmatter

Let's walk through a concrete skill. Task: adding a new blog post to a Next.js content-markdown repo.

```yaml
---
name: add-blog-post
description: |
  Trigger when the user asks to add a blog post, write a new article, or
  publish a piece of content. Phrases: "add blog post", "write article",
  "new post", "publish", "add to blog".
allowed-tools:
  - Read
  - Write
  - Edit
  - Glob
---
```

The `description` is the most important field. It is what Claude reads when deciding whether to load your skill. Lead with the trigger condition ("Trigger when..."). Follow with example phrases.

The common mistake is writing a description that sounds like marketing copy. Skills do not need to sell themselves. They need to be findable by pattern matching. The phrases in the description are literal matches Claude uses to route the request to your skill.

`allowed-tools` is the declaration of what tools the skill expects to use. It is not a permission system; it is documentation. The skill author is telling the user "if your Claude Code config does not allow these, this skill will not work."

## Write the body

The body is a second-person playbook. You are writing instructions to another engineer.

Start with prerequisites. What has to be true before the skill can run? For the blog post example:

```markdown
## Prerequisites

- Markdown content directory identified (e.g., `content/blog/`)
- Frontmatter shape known (check an existing post for reference)
- Hero image directory identified if the site uses featured images
```

Continue with steps. Use numbered, imperative instructions. Each step should be one action the agent can verify.

```markdown
## Steps

1. **Find the content directory.** Glob `content/**/blog*` or check an
   existing post. Confirm with the user if there are multiple candidates.

2. **Read an existing post.** Pick the most recent post and study its
   frontmatter shape. Blog schemas vary: date format, tags array,
   relatedPosts, series fields. Do not guess.

3. **Create the new post file.** Use the slug as the filename:
   `content/blog/<slug>.md`. Never overwrite an existing file.

4. **Write the frontmatter.** Copy the shape from step 2. Fields that
   always matter:
   - title (sentence case, under 70 chars)
   - slug (kebab-case, matches filename)
   - excerpt (one sentence, not a rewrite of the title)
   - date (ISO)
   - tags (array, match existing site conventions)

5. **Write the body.** Open with a lead paragraph, not a heading.
   Use `## H2` for sections, not `#`. Short paragraphs. No em dashes.

6. **Verify frontmatter parses.** Run the site's build or lint step
   if available. If not, grep for any broken frontmatter on existing
   posts and match that pattern exactly.

7. **Tell the user what was added** and where. Include the slug and
   the file path.
```

The pattern: prerequisites, numbered steps, each step both specific and verifiable. No fluff.

## Write the failure-mode section

Every skill that ships should have a "common mistakes" or "anti-patterns" section. This is where you bake in the lessons from past failures.

```markdown
## Common mistakes to avoid

- Do not use em dashes. Use regular dashes with spaces.
- Do not guess at frontmatter fields. Read an existing post.
- Do not write a post without a date field. Sort order depends on it.
- Do not overwrite existing posts. If the slug collides, pick a new slug.
```

This is the highest-leverage section of the skill. Every item here is a mistake you, or Claude, or a prior session made before. Future sessions save the time of discovering it again.

## Write the output section

End the skill with a specification of what success looks like.

```markdown
## Output

- New markdown file at `content/blog/<slug>.md`
- Frontmatter matches existing post shape
- File passes the site's build or frontmatter validator
- User receives the slug and file path in the response
```

This section is both a contract for Claude and a checklist for you when reviewing the skill's output.

## Install the skill

Save the file as `~/.claude/skills/add-blog-post/SKILL.md`. Claude Code auto-discovers skills under `~/.claude/skills/*/SKILL.md` at session start.
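Setting that up from the terminal is two commands; a quick sketch using the paths above:

```shell
# create the skill directory Claude Code scans at session start
mkdir -p ~/.claude/skills/add-blog-post
# save SKILL.md there, then confirm it is where Claude Code will look
ls ~/.claude/skills/
```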

To test it, start a new Claude Code session (or `/reload` in an existing one) and send a prompt that matches the description:

> "Add a blog post about our Q2 release"

Claude should load the skill, confirm the prerequisites, and follow the steps. If it does not load the skill, the description's trigger phrases probably did not match. Revise the description and retry.

## Test with real prompts

Test a skill with at least three prompts before trusting it:

1. The exact phrase from the description ("add a blog post")
2. A paraphrase the user might actually say ("can you write a new post about X")
3. An edge case (vague or partial request, e.g. "post about the new feature")

For each prompt, check that the skill loads, that Claude follows the steps in order, and that the output matches the spec. If any of these fail, the skill is not ready.

## Iterate from failure

The first version of any skill is wrong. You will see Claude skip a step, misinterpret a prerequisite, or invent a variation of the pattern. That is fine. The skill file is a living document. Every failure is a note to add to the "common mistakes" section or a clarification to add to the relevant step.

The most valuable skills in my library are on their fifth or sixth revision. Each revision captures a specific failure mode from real use.

## Keep the skill short

Skills that exceed 500 lines are usually too long. They are trying to teach Claude too many things at once and the trigger becomes muddy. When a skill grows past that threshold, split it.

A useful heuristic: if you could not verbally explain the skill to a new engineer in three minutes, it is too big.

## The six-line litmus test

Before you ship a skill, read the frontmatter and the first line of the body. If those lines answer "when should I use this?" and "what will happen if I do?", ship it. If they do not, rewrite before shipping. The most common skill failure is not that the body is bad but that the top of the file is vague.

## Where to go next

- Browse real SKILL.md examples at [skills.developersdigest.tech](https://skills.developersdigest.tech) for patterns across categories.
- Read our [context engineering guide](/blog/context-engineering-guide) for the broader theory behind skills, CLAUDE.md, and memory.
- Write a second skill. The gap between zero skills and one is large; the gap between one and three is smaller. Fluency comes with volume.

The skill file is the unit of encoded opinion in Claude Code. Every skill you write is a lesson future-you does not need to re-learn and every other teammate can benefit from. Write them carefully, but write them.
]]></content:encoded>
      <pubDate>Thu, 23 Apr 2026 00:00:00 GMT</pubDate>
      <category>getting-started</category>
      <category>Guide</category>
      <author>Developers Digest</author>
      
    </item>
    <item>
      <title><![CDATA[Migrating from Cursor to Claude Code]]></title>
      <link>https://www.developersdigest.tech/guides/migrating-from-cursor-to-claude-code</link>
      <guid isPermaLink="true">https://www.developersdigest.tech/guides/migrating-from-cursor-to-claude-code</guid>
      <description><![CDATA[A concrete step-by-step guide to moving your development workflow from Cursor to Claude Code - settings, rules, keybindings, and the habits that transfer.]]></description>
      <content:encoded><![CDATA[
## Why people make this switch

If you are reading this, you probably already know why. The most common triggers I hear:

- You tried a parallel agent workflow and found Cursor's chat-first UX limiting
- You hit the token ceiling on Cursor Pro and the bill surprised you
- You want terminal-native tooling that plays well with tmux, worktrees, and cron
- You want to run long-running autonomous sessions that Cursor's architecture does not support

Claude Code is not a strict upgrade over Cursor. It is a different shape. You trade Cursor's visual inline-diff experience for a terminal agent with deeper memory, longer context handling, and better parallelism. If your workflow was already mostly chat-driven with occasional inline completions, the transition is cheaper than it looks.

This guide assumes you are staying on macOS or Linux; Claude Code on Windows is supported but the setup paths differ slightly.

## Before you start

Decide up front which of these three paths you are taking:

1. **Full switch.** Delete Cursor, commit to Claude Code for everything.
2. **Hybrid.** Keep Cursor for visual review and inline edits, use Claude Code for autonomous work.
3. **Evaluation.** Two-week trial, switch back if it does not stick.

Most people land on option 2 for the first month, then drift toward option 1 as they build habits. Know which one you are trying and check in at the two-week mark.

## Step 1: Install Claude Code

```bash
npm install -g @anthropic-ai/claude-code
claude --version
```

Or via Homebrew:

```bash
brew install anthropic/claude-code/claude-code
```

You will need an Anthropic API key or an active Claude Max subscription. Max is the better default for any serious use - metered API costs add up faster than you expect for autonomous work. Set the key with:

```bash
export ANTHROPIC_API_KEY=sk-ant-...
# or log in with Max:
claude login
```

## Step 2: Migrate your Cursor Rules

This is the biggest single piece of the transition. Cursor Rules become Claude Code's `CLAUDE.md`.

Find your Cursor Rules:
- `.cursor/rules/*.md` in each project
- `Cursor Settings > AI > Rules for AI` for global rules

For each project with Cursor Rules, create a `CLAUDE.md` at the repo root. Copy over the content, then clean it up using Claude Code conventions:

```markdown
# Project Name

<One-paragraph description of what this project is and who uses it>

## Stack

- Framework, language, package manager

## Rules

- Concrete rules, one per line, imperative voice
- "Use pnpm, not npm" not "please prefer pnpm"
- "No em dashes" not "avoid excessive punctuation"

## Commands

- `pnpm dev` - local development
- `pnpm test` - run tests
- `pnpm lint` - typecheck and lint

## Architecture notes

<Anything that is not obvious from the file tree>
```
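A mechanical way to get a starting point is to concatenate the old rules and then reshape them by hand. A throwaway demo in a scratch directory (the file name and rule are illustrative):

```shell
cd "$(mktemp -d)"
mkdir -p .cursor/rules
echo "- Use pnpm, not npm" > .cursor/rules/style.md
# seed CLAUDE.md from the Cursor rules, then restructure it to the shape above
cat .cursor/rules/*.md > CLAUDE.md
cat CLAUDE.md
```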

For global rules, write a `~/.claude/CLAUDE.md`. Claude Code loads this every session.

## Step 3: Rebuild your snippet library

Cursor has a snippets and prompt library feature. Claude Code equivalents:

- **One-shot prompts** that you reuse - save to `~/.claude/prompts/<name>.md`, reference with `@prompts/<name>`
- **Complex procedures** - write them as skills at `~/.claude/skills/<name>/SKILL.md` with clear trigger phrases
- **Project-specific prompts** - save to `.claude/prompts/<name>.md` at the repo root

The skills system is the closest analog to Cursor's Composer templates, but it is more powerful: skills load automatically based on the trigger phrase in the description, rather than needing explicit invocation.
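Creating a reusable prompt is just a file write. A sketch (the prompt name and wording are illustrative):

```shell
mkdir -p ~/.claude/prompts
# a one-shot prompt you expect to reuse across sessions
cat > ~/.claude/prompts/release-notes.md << 'EOF'
Draft release notes from the commits since the last tag. Group by area,
lead with user-facing changes, keep each bullet under 20 words.
EOF
```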

## Step 4: Keybindings and editor integration

Claude Code is terminal-native, so your editor keybindings stay where they are. The integration point moves to the terminal layer instead:

- **tmux users** are usually already happy. Claude Code runs in any pane.
- **Alacritty / Kitty / Ghostty users** benefit from the fast redraw when Claude is streaming output
- **Warp users** can use the AI blocks and still run Claude Code side-by-side

If you previously used Cursor's shortcut for "new Composer", build the muscle memory for your terminal equivalent. Most people bind a tmux key to "new Claude Code session" within one week.

## Step 5: Learn the workflow changes

Cursor's core flow is chat with visual diffs, then apply. Claude Code's core flow is agent-driven edit with preview-before-commit. Three habits to build:

**Let the agent do more in one turn.** In Cursor you would often send five small prompts that together accomplish a task. In Claude Code, one well-scoped prompt accomplishes the same task with less back-and-forth. The agent is expected to make multiple related edits without asking.

**Review the diff, not the intent.** Cursor trains you to approve each edit. Claude Code trains you to let a batch of edits happen and then review the resulting diff with `git diff` at the end. This is faster once you trust it.

**Use worktrees for parallel work.** The `git worktree` feature becomes a first-class part of your workflow. You start a feature in a worktree, run Claude Code there, and the main branch stays untouched. This is what Zed's parallel-agents feature automates, but you can do it today with shell commands.
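The worktree flow above can be sketched in a scratch repo (the branch and directory names are illustrative):

```shell
set -e
cd "$(mktemp -d)" && git init -q app && cd app
git -c user.email=dev@example.com -c user.name=dev commit -q --allow-empty -m "init"
# a worktree is a second checkout: run Claude Code there while main stays untouched
git worktree add -q ../app-login -b feature/login
git worktree list
```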

## Step 6: Set up the skills you actually need

Start with three skills most developers reach for:

1. **feature-dev** - end-to-end feature building. Plan, implement, test, review.
2. **code-review** - run a review pass on uncommitted changes or a specific PR.
3. **commit-commands** - draft a commit message and run the commit.

Browse the [skills marketplace](https://skills.developersdigest.tech) for patterns, but start by copying the SKILL.md of a skill close to your need and adapting it. The learning curve is fast - by the third skill you write, you will have internalized the format.

## Step 7: Wire MCP servers

This is where Claude Code pulls ahead of Cursor for most developers. MCP servers give your agent access to external systems - GitHub, Slack, Linear, your database, a search engine. Cursor supports MCP now too, but the ecosystem is deeper and the tooling is more mature in Claude Code.

Start with three MCP servers:

- **context7** or similar docs server for up-to-date library documentation
- **filesystem** server for explicit directory permissions
- **github** server for PR management and issue triage

Install via:

```bash
claude mcp add context7 -- npx -y @upstash/context7-mcp
```

Browse more at [mcp.developersdigest.tech](https://mcp.developersdigest.tech).

## Step 8: The first week test

After a week on Claude Code, ask:

- Did I feel faster or slower on a typical feature?
- Did I end up opening Cursor for specific tasks? Which ones?
- Did my code review practice survive the agent doing multi-file edits?
- Am I paying more or less than Cursor Pro?

If you find yourself opening Cursor for visual inline diff review, that is fine. Many people settle into "Claude Code for autonomous work, Cursor for inline review" as a long-term hybrid. The real failure mode is opening Cursor because you do not trust Claude Code yet - that is a signal to either watch a tutorial on autonomous mode or extend the trial another two weeks.

## Common gotchas

- **You forgot to write CLAUDE.md.** Claude Code without CLAUDE.md is Claude Code without context. It will work, but it will feel dumber than Cursor felt with your Rules loaded. Write the CLAUDE.md on day one.
- **You are still chatting in small turns.** The agent is designed for larger tasks per turn. Batch your requests.
- **You are not using worktrees.** One of the biggest power moves is lost if you run everything on main.
- **You expect visual inline diffs.** Claude Code produces full file edits you review after. Different mental model. Use `git diff` to see what changed.
- **You skipped installing skills.** The out-of-box experience is thin. Install three to five skills on day one.

## When not to switch

A few genuine reasons to stay on Cursor:

- You spend most of your time on small inline edits with tight feedback loops
- You rely heavily on Composer's visual file-selection UI
- Your team's code review workflow is tightly integrated with Cursor's diff viewer
- You are on a Windows machine and have never set up WSL

These are real workflows. Claude Code is not strictly better for every developer. It is better for developers whose work has grown into agent-scale tasks, parallel work, and autonomous runs.

## Further reading

- [Getting Started with Claude Code](/guides/claude-code-getting-started) - the basics, if you skipped them
- [How to Write CLAUDE.md: The Complete Guide](/blog/how-to-write-claudemd-the-complete-guide) - deep dive on the key config file
- [Writing Your First Claude Code Skill](/guides/writing-your-first-claude-code-skill) - build your own skill library

The transition is real work. The payoff is usually real. If you are going to try, commit to two weeks of Claude-Code-first usage and then decide.
]]></content:encoded>
      <pubDate>Thu, 23 Apr 2026 00:00:00 GMT</pubDate>
      <category>getting-started</category>
      <category>Guide</category>
      <author>Developers Digest</author>
      
    </item>
    <item>
      <title><![CDATA[Chronicle Research Preview Setup Guide]]></title>
      <link>https://www.developersdigest.tech/guides/chronicle-research-preview</link>
      <guid isPermaLink="true">https://www.developersdigest.tech/guides/chronicle-research-preview</guid>
      <description><![CDATA[Set up Codex Chronicle on macOS, manage permissions, and understand privacy, security, and troubleshooting.]]></description>
      <content:encoded><![CDATA[
# Chronicle Setup Guide

Chronicle is in an opt-in research preview and is only available for ChatGPT Pro subscribers on macOS. It is not yet available in the EU, UK, or Switzerland.

Chronicle augments Codex memories with context from your screen so prompts can pick up work-in-progress context from recent activity.

Before enabling it, review the privacy and security section and understand the risks.

## How Chronicle helps

Chronicle can reduce the amount of context you need to restate when working with Codex.

In practice it helps with:

- Using what is visible on screen so Codex can follow what you are currently looking at
- Filling missing context when you jump between tasks
- Remembering tools and workflows that you repeatedly use

When the context is relevant, Codex uses it to identify the best source, such as a file, Slack thread, Google Doc, dashboard, or pull request.

## Enable Chronicle

1. Open **Settings** in the Codex app.
2. Go to **Personalization** and enable **Memories**.
3. Turn on **Chronicle** below Memories.
4. Review the consent dialog and choose **Continue**.
5. Grant macOS **Screen Recording** and **Accessibility** permissions when prompted.
6. When setup completes, choose **Try it out** or start a new thread.

If macOS reports that a permission is denied, open **System Settings > Privacy & Security** and enable Codex under **Screen Recording** and **Accessibility**.

If a permission is restricted by macOS or your organization, Chronicle will start once the restriction is removed and permissions are granted.

## Pause or disable Chronicle

You can pause and resume Chronicle from the Codex menu bar icon at any time.

- Use **Pause Chronicle** before meetings or when viewing sensitive content.
- Use **Resume Chronicle** when you want context to be captured again.

To disable it permanently:

1. Open **Settings > Personalization > Memories**.
2. Turn off **Chronicle**.

You can also control whether memories are used on a per-thread basis; see the memories settings documentation for details.

## Rate limits

Chronicle runs background agents that summarize captured screen images into memories, and these agents can consume rate limits quickly.

## Privacy and security

Chronicle uses screen captures and does not have access to microphone or system audio.

### Where it stores data

- Temporary screen captures appear under `$TMPDIR/chronicle/screen_recording/` while Chronicle is running; frames older than six hours are deleted automatically.
- Generated memories are saved in markdown files under `$CODEX_HOME/memories_extensions/chronicle/` (usually `~/.codex/memories_extensions/chronicle`).

Both locations can contain sensitive information. Do not share them with others.

You can ask Codex to search these memories. If you want Codex to forget something, delete or edit the respective markdown file.

### What data gets shared with OpenAI

Chronicle captures are processed locally first, then summarized by Codex using selected screenshot frames, OCR text, timing, and local file paths.

Temporary screen captures are processed on OpenAI servers only for memory generation. OpenAI does not store screenshots after processing unless required by law, and does not use them for training.

The generated memories stay local in `$CODEX_HOME/memories_extensions/chronicle/`.

Relevant memory contents can be included as context in future sessions.

## Prompt injection risk

Chronicle increases prompt injection risk from screen content. If you open a site with malicious instructions, Codex can be tricked into following them. Be cautious when running Chronicle in high-risk browsing environments.

## Troubleshooting

### I do not see the Chronicle setting

1. Confirm you are on a Codex app build that includes Chronicle.
2. Confirm **Memories** is enabled in **Settings > Personalization**.
3. Confirm Chronicle is available for your region and subscription tier.

### Setup does not complete

1. Confirm Codex has **Screen Recording** and **Accessibility** permissions.
2. Quit and reopen the Codex app.
3. Open **Settings > Personalization** and check Chronicle status.

### Which model is used for Chronicle memories

Chronicle uses the same model as your other memories.

If you did not set a specific model, it uses the default Codex model. To pin one, set `consolidation_model` in configuration.

```toml
[memories]
consolidation_model = "gpt-5.4-mini"
```
]]></content:encoded>
      <pubDate>Tue, 21 Apr 2026 00:00:00 GMT</pubDate>
      <category>getting-started</category>
      <category>Guide</category>
      <author>Developers Digest</author>
      
    </item>
    <item>
      <title><![CDATA[Run AI Models Locally with Ollama and LM Studio]]></title>
      <link>https://www.developersdigest.tech/guides/run-ai-models-locally</link>
      <guid isPermaLink="true">https://www.developersdigest.tech/guides/run-ai-models-locally</guid>
      <description><![CDATA[Install Ollama and LM Studio, pull your first model, and run AI locally for coding, chat, and automation - with zero cloud dependency.]]></description>
      <content:encoded><![CDATA[
# Run AI Models Locally with Ollama and LM Studio

Running AI models on your own machine gives you something no cloud API can: complete control. No usage limits, no API keys, no data leaving your computer. This guide walks you through setting up both Ollama (CLI-first) and LM Studio (GUI-first), choosing the right models, and integrating local AI into your development workflow.

## Why run models locally?

There are four compelling reasons to run models on your own hardware instead of relying on cloud APIs.

**Privacy.** Your code and prompts never leave your machine. This matters when you are working on proprietary codebases, handling sensitive data, or operating under compliance requirements. Local inference means zero data exposure.

**Cost.** Cloud API calls add up fast. GPT-4 class models cost $10-30 per million tokens. A local model running on your GPU costs nothing per request after the initial hardware investment. If you run hundreds of queries a day, the savings are significant.

**Speed.** No network round trip. Local models respond in milliseconds for short prompts, especially on modern GPUs. You skip DNS lookups, TLS handshakes, queue times, and rate limits entirely.

**Offline access.** Airplanes, coffee shops with bad wifi, network outages - none of these stop a local model. Once downloaded, the model works with zero internet connectivity.

The tradeoff is clear: local models are smaller and less capable than the largest cloud models. But for many tasks - code completion, documentation, refactoring, Q&A - a well-chosen local model is more than sufficient.
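The cost point is easy to check with back-of-envelope arithmetic (the workload and price here are illustrative, not measured):

```python
# hypothetical workload: 300 requests/day at roughly 4k tokens each
requests_per_day = 300
tokens_per_request = 4_000
price_per_million_tokens = 15.0  # USD, mid-range of the $10-30 figure above

monthly_tokens = requests_per_day * 30 * tokens_per_request
monthly_cost = monthly_tokens / 1_000_000 * price_per_million_tokens
print(f"${monthly_cost:,.0f}/month")  # $540/month for cloud; $0 marginal locally
```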

## Two tools, two approaches

Before diving in, here is how the two main tools compare:

| Feature | Ollama | LM Studio |
|---------|--------|-----------|
| Interface | CLI + REST API | Desktop GUI + REST API |
| Best for | Developers, scripting, CI/CD | Visual exploration, non-technical users |
| Model format | GGUF (auto-managed) | GGUF (browse and download) |
| Model discovery | `ollama pull <name>` | Built-in search and download UI |
| API | OpenAI-compatible at :11434 | OpenAI-compatible at :1234 |
| OS support | macOS, Linux, Windows | macOS, Linux, Windows |
| Resource usage | Lightweight daemon | Electron app, heavier footprint |
| Custom models | Modelfile system | Import any GGUF file |

Both tools are free. Most developers end up using Ollama for day-to-day coding workflows and LM Studio for model exploration and testing. You can run both side by side without conflicts since they use different ports.

---

## Part 1: Ollama (CLI-first)

Ollama is the easiest way to run local models from the terminal. It handles model downloads, quantization, memory management, and provides both a CLI and an API server.

### Install Ollama

**macOS:**

```bash
# Install via Homebrew
brew install ollama
```

Or download the desktop app from [ollama.com/download](https://ollama.com/download).

After installation, Ollama runs as a background service automatically. You can verify it is running:

```bash
ollama --version
```

**Linux:**

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

This installs Ollama and sets up a systemd service. The service starts automatically:

```bash
# Check status
systemctl status ollama

# Start manually if needed
systemctl start ollama
```

For NVIDIA GPU support, make sure your CUDA drivers are up to date (or install the NVIDIA Container Toolkit if you run Ollama in Docker). Ollama detects your GPU automatically.

**Windows:**

Download the installer from [ollama.com/download](https://ollama.com/download). Run the `.exe` and follow the prompts. Ollama runs in the system tray.

For WSL2 users, install the Linux version inside your WSL2 distro instead. This gives you better GPU passthrough and a more consistent development experience.

### Verify the installation

```bash
# Should print the version number
ollama --version

# List downloaded models (empty on fresh install)
ollama list

# The API server runs on port 11434 by default
curl http://localhost:11434/api/tags
```

### Your first model: ollama run llama4

Pull and run a model. Llama 4 is Meta's latest open-weight model and a solid starting point.

```bash
# Pull and start an interactive chat session
ollama run llama4
```

The first run downloads the model (this takes a few minutes depending on your connection). Subsequent runs start instantly since the model is cached locally.

Once the model loads, you get an interactive prompt:

```
>>> What is the time complexity of quicksort?
Quicksort has an average-case time complexity of O(n log n) and a
worst-case time complexity of O(n^2). The worst case occurs when the
pivot selection consistently picks the smallest or largest element,
leading to unbalanced partitions...
```

Type `/bye` to exit the session.

### Useful Ollama commands

```bash
# List all downloaded models
ollama list

# Pull a model without starting a chat
ollama pull qwen3.5-coder:32b

# Remove a model to free disk space
ollama rm llama4

# Show model details (parameters, quantization, size)
ollama show llama4

# Set a system prompt from inside an interactive session:
#   >>> /set system You are a senior Python developer. Be concise.

# Pipe input from a file
cat bug-report.txt | ollama run llama4 "Summarize this bug report in 3 bullet points"

# Run the API server explicitly (usually auto-started)
ollama serve
```

### Creating custom models with Modelfile

Ollama lets you create custom model configurations using a Modelfile. This is useful for baking in a system prompt, adjusting parameters, or layering fine-tuned weights.

```bash
cat > Modelfile << 'HEREDOC'
FROM qwen3.5-coder:32b
SYSTEM "You are a senior full-stack developer. You write clean, well-tested TypeScript and Python. Be concise. Show code, not explanations."
PARAMETER temperature 0.2
PARAMETER num_ctx 8192
HEREDOC

ollama create my-coder -f Modelfile
ollama run my-coder
```

Your custom model appears in `ollama list` and can be used anywhere you reference a model name - in API calls, tool integrations, and scripts.

---

## Part 2: LM Studio (GUI-first)

LM Studio is a desktop application that lets you discover, download, and run local models through a visual interface. If you prefer clicking over typing, or you want a fast way to compare models side by side, LM Studio is the tool for you.

### Install LM Studio

Download the installer for your platform from [lmstudio.ai](https://lmstudio.ai).

- **macOS:** Download the `.dmg`, drag to Applications, and launch.
- **Windows:** Download the `.exe` installer and run it.
- **Linux:** Download the `.AppImage`, make it executable with `chmod +x`, and run it.

LM Studio requires no additional dependencies. It bundles its own inference engine (based on llama.cpp) and handles GPU detection automatically.

### The LM Studio interface

When you open LM Studio, you see four main sections:

1. **Discover** - Browse and search the Hugging Face model catalog directly from the app. Filter by size, quantization, architecture, and popularity. Click download on any GGUF model to pull it locally.

2. **Chat** - An interactive chat interface where you pick a model from your local library and start a conversation. You can adjust temperature, max tokens, system prompt, and other parameters in real time from the sidebar.

3. **My Models** - Your local model library. Shows all downloaded models with size, quantization level, and last-used date. You can delete models from here to reclaim disk space.

4. **Developer** - The local API server. Toggle it on to expose an OpenAI-compatible API endpoint at `http://localhost:1234/v1`. Any tool or script that works with the OpenAI API can point at this endpoint.

### Downloading your first model

1. Open the **Discover** tab
2. Search for "qwen3.5-coder" or "llama 4"
3. You will see multiple versions of each model - look for GGUF files with Q4_K_M quantization as a good starting point
4. Click the download button next to the version you want
5. Wait for the download to complete (progress bar shows in the app)

LM Studio stores models in `~/.cache/lm-studio/models/` on macOS and Linux, and `C:\Users\<you>\.cache\lm-studio\models\` on Windows.

### Running a model in chat

1. Go to the **Chat** tab
2. Click the model selector dropdown at the top
3. Pick a downloaded model
4. Wait a few seconds for it to load into memory
5. Type your message and press Enter

The sidebar lets you adjust these parameters on the fly:

- **Temperature** - Controls randomness. Use 0.1-0.3 for code, 0.7-1.0 for creative text.
- **Max tokens** - Maximum response length. Set higher for long code generation.
- **System prompt** - Instructions that apply to the whole conversation.
- **Context length** - How much previous conversation the model can see. Higher values use more RAM.
- **GPU offload** - How many layers to run on GPU vs CPU. More GPU layers means faster inference.

### Starting the local API server

The real power of LM Studio for developers is its local API server.

1. Go to the **Developer** tab
2. Select a model to serve
3. Click **Start Server**
4. The server starts at `http://localhost:1234/v1`

You can now call it from any tool or script using the OpenAI API format:

```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "local-model",
    "messages": [
      {"role": "system", "content": "You are a helpful coding assistant."},
      {"role": "user", "content": "Write a TypeScript function that debounces another function"}
    ],
    "temperature": 0.2
  }'
```

Python example:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",
    api_key="lm-studio",  # required by the library but not checked
)

response = client.chat.completions.create(
    model="local-model",
    messages=[
        {"role": "system", "content": "You are a senior TypeScript developer."},
        {"role": "user", "content": "Explain the builder pattern with an example"},
    ],
    temperature=0.3,
)

print(response.choices[0].message.content)
```

Note: The model name in API calls can be anything when using LM Studio - it routes to whichever model you have loaded in the Developer tab. Some setups use `"local-model"` as a convention.

### Comparing models side by side

One of LM Studio's standout features is the ability to load two models and compare their responses to the same prompt. This is invaluable when deciding which model to use for a specific task.

1. In the Chat tab, click the "+" button to create a new chat
2. Load a different model in this tab
3. Send the same prompt to both
4. Compare quality, speed, and token usage

This visual comparison is something Ollama cannot do without custom scripting.
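For reference, the Ollama version of that scripting is short. The sketch below is illustrative, not a built-in feature: it sends one prompt to each model through Ollama's OpenAI-compatible endpoint and collects the replies (the endpoint and model names assume the defaults used earlier in this guide).

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/v1/chat/completions"  # Ollama's default endpoint

def build_payload(model: str, prompt: str) -> dict:
    """OpenAI-style chat payload for a single user message."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def compare(models: list[str], prompt: str) -> dict[str, str]:
    """Send the same prompt to each model and collect the replies."""
    replies = {}
    for model in models:
        req = urllib.request.Request(
            OLLAMA_URL,
            data=json.dumps(build_payload(model, prompt)).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            replies[model] = json.load(resp)["choices"][0]["message"]["content"]
    return replies

# Usage (with Ollama running and both models pulled):
#   for name, reply in compare(["qwen3.5-coder:7b", "llama4"], "Explain a mutex").items():
#       print(f"--- {name} ---\n{reply}\n")
```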

---

## Best models for coding

Not all models are created equal for programming tasks. Here are the top choices for code generation, completion, and refactoring as of April 2026.

### Qwen 3.5 Coder

The current leader for local code generation. Available in multiple sizes to fit your hardware.

```bash
# 32B parameters - best quality, needs 20GB+ VRAM
ollama run qwen3.5-coder:32b

# 14B - great balance of quality and speed
ollama run qwen3.5-coder:14b

# 7B - fast, works on 8GB VRAM
ollama run qwen3.5-coder:7b
```

Qwen 3.5 Coder excels at:
- Multi-file code generation
- Understanding complex codebases
- TypeScript, Python, Rust, and Go
- Following coding conventions from context

### DeepSeek Coder V3

Strong at code reasoning and multi-step problem solving. Particularly good at debugging.

```bash
# 33B - full quality
ollama run deepseek-coder-v3:33b

# 7B - lightweight option
ollama run deepseek-coder-v3:7b
```

Best for:
- Debugging and error analysis
- Algorithm implementation
- Code review and suggestions
- Mathematical and logical reasoning in code

### CodeLlama

Meta's code-specialized Llama variant. Mature, well-tested, and widely supported by tools.

```bash
# 34B - best quality
ollama run codellama:34b

# 13B - good middle ground
ollama run codellama:13b

# 7B - lightweight
ollama run codellama:7b
```

Best for:
- Code infilling (fill-in-the-middle)
- Large context windows (up to 100K tokens)
- Broad language support
- Integration with older tooling that expects CodeLlama

### Quick comparison for coding models

| Model | Size | VRAM Needed | Speed | Code Quality |
|-------|------|-------------|-------|-------------|
| Qwen 3.5 Coder 32B | 18GB | 24GB | Medium | Excellent |
| Qwen 3.5 Coder 14B | 8GB | 12GB | Fast | Very Good |
| DeepSeek Coder V3 33B | 19GB | 24GB | Medium | Excellent |
| DeepSeek Coder V3 7B | 4GB | 8GB | Very Fast | Good |
| CodeLlama 34B | 19GB | 24GB | Medium | Very Good |
| CodeLlama 7B | 4GB | 8GB | Very Fast | Decent |

## Best models for general use

For chat, writing, summarization, and general reasoning tasks, these models lead the pack.

### Llama 4

Meta's flagship open model. Strong across the board for general tasks.

```bash
# Scout variant - lighter, faster
ollama run llama4

# Maverick variant - larger, more capable
ollama run llama4:maverick
```

### Mistral

Mistral's models punch well above their weight class. Excellent efficiency-to-quality ratio.

```bash
# Mistral Large - top quality
ollama run mistral-large

# Mistral Small - fast and capable
ollama run mistral-small

# Mistral 7B - lightweight classic
ollama run mistral:7b
```

### Phi-4

Microsoft's compact model series. Surprisingly capable for its size.

```bash
# Phi-4 14B - best in class for its size
ollama run phi4:14b
```

### Quick comparison for general models

| Model | Size | VRAM Needed | Speed | Quality |
|-------|------|-------------|-------|---------|
| Llama 4 Scout | 15GB | 20GB | Medium | Excellent |
| Llama 4 Maverick | 25GB | 32GB | Slow | Outstanding |
| Mistral Large | 22GB | 28GB | Medium | Excellent |
| Mistral Small | 8GB | 12GB | Fast | Very Good |
| Phi-4 14B | 8GB | 10GB | Fast | Very Good |

## Using local models with AI coding tools

The real power of local models comes from integrating them into your existing development workflow.

### Claude Code

Claude Code can use local models as a backend through the OpenAI-compatible API that Ollama provides.

```bash
# Set the environment variables to point at your local Ollama
export OPENAI_API_BASE=http://localhost:11434/v1
export OPENAI_API_KEY=ollama
```

Or point at LM Studio:

```bash
export OPENAI_API_BASE=http://localhost:1234/v1
export OPENAI_API_KEY=lm-studio
```

You can also configure a model alias in your shell profile:

```bash
# Add to ~/.zshrc or ~/.bashrc
alias claude-local='OPENAI_API_BASE=http://localhost:11434/v1 claude'
```

### Cursor

Cursor has built-in support for local models.

1. Open the command palette (Cmd+Shift+P on macOS, Ctrl+Shift+P on Linux/Windows) and run "Cursor Settings"
2. Navigate to **Models** > **Model Provider**
3. Select **Ollama** as the provider
4. Choose your model from the dropdown (Cursor auto-detects running models)

Alternatively, configure it in `~/.cursor/settings.json`:

```json
{
  "ai.provider": "ollama",
  "ai.model": "qwen3.5-coder:32b",
  "ai.endpoint": "http://localhost:11434"
}
```

For LM Studio, set the provider to "OpenAI Compatible" and point at `http://localhost:1234/v1`.

### Continue.dev

Continue is an open-source AI coding assistant that runs in VS Code and JetBrains. It has excellent local model support.

Install the Continue extension, then edit `~/.continue/config.yaml`:

```yaml
models:
  - title: "Qwen 3.5 Coder 32B"
    provider: ollama
    model: qwen3.5-coder:32b
    apiBase: http://localhost:11434

  - title: "LM Studio Model"
    provider: lmstudio
    model: local-model
    apiBase: http://localhost:1234

tabAutocompleteModel:
  title: "Qwen Coder 7B"
  provider: ollama
  model: qwen3.5-coder:7b
  apiBase: http://localhost:11434
```

This gives you a full local AI coding setup: the 32B model for chat and generation, and the fast 7B model for tab autocomplete.

### Using the API directly

Both Ollama and LM Studio expose OpenAI-compatible REST APIs. You can call them from any language or tool.

**Ollama (port 11434):**

```bash
curl http://localhost:11434/v1/chat/completions -d '{
  "model": "qwen3.5-coder:32b",
  "messages": [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Explain async/await in JavaScript"}
  ]
}'
```

**LM Studio (port 1234):**

```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "local-model",
    "messages": [
      {"role": "system", "content": "You are a helpful coding assistant."},
      {"role": "user", "content": "Explain async/await in JavaScript"}
    ]
  }'
```

Python example using the `openai` library (works with either backend):

```python
from openai import OpenAI

# For Ollama
client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",
)

# For LM Studio
# client = OpenAI(
#     base_url="http://localhost:1234/v1",
#     api_key="lm-studio",
# )

response = client.chat.completions.create(
    model="qwen3.5-coder:32b",
    messages=[
        {"role": "system", "content": "You are a senior developer."},
        {"role": "user", "content": "Review this function for bugs"},
    ],
)

print(response.choices[0].message.content)
```

## Performance tips

Getting the best performance out of local models requires understanding a few key concepts.

### Quantization

Models come in different quantization levels that trade quality for speed and memory usage. Both Ollama and LM Studio handle this, but you can choose specific quantizations.

```bash
# Q4_K_M - default, good balance (recommended)
ollama run qwen3.5-coder:32b

# Q8_0 - higher quality, more memory
ollama run qwen3.5-coder:32b-q8_0

# Q2_K - smallest, fastest, lowest quality
ollama run qwen3.5-coder:32b-q2_k
```

In LM Studio, you see the quantization level listed next to each download option. Look for "Q4_K_M" or "Q5_K_M" for the best balance.

| Quantization | Quality | Size (32B model) | Speed |
|-------------|---------|-------------------|-------|
| Q2_K | Decent | ~12GB | Fastest |
| Q4_K_M | Very Good | ~18GB | Fast |
| Q5_K_M | Excellent | ~22GB | Medium |
| Q8_0 | Near-Original | ~34GB | Slow |
| FP16 | Original | ~64GB | Slowest |

For coding tasks, Q4_K_M is the sweet spot. Below Q4, you start seeing noticeable quality degradation in code generation. Q8_0 is worth it if you have the VRAM.
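You can sanity-check the size column yourself: a quantized model's weights take roughly parameter count times bits per weight, divided by 8. A back-of-envelope sketch - the effective bits-per-weight values are approximations, and real GGUF files add some metadata overhead:

```python
# Approximate effective bits per weight for common GGUF quantizations
BITS_PER_WEIGHT = {"Q2_K": 3.0, "Q4_K_M": 4.5, "Q5_K_M": 5.5, "Q8_0": 8.5, "FP16": 16.0}

def model_size_gb(params_billion: float, quant: str) -> float:
    """Rough on-disk (and in-memory) size of the weights in GB."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8

# Reproduces the size column above for a 32B model
for quant in BITS_PER_WEIGHT:
    print(f"32B at {quant}: ~{model_size_gb(32, quant):.0f} GB")
```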

### GPU vs CPU inference

GPU inference is dramatically faster than CPU inference. If you have a dedicated GPU, make sure your tool is using it.

```bash
# Check if Ollama detects your GPU
ollama ps

# Force GPU layers (useful for partial offloading)
OLLAMA_NUM_GPU=999 ollama run llama4
```

In LM Studio, the GPU offload slider in the model settings controls how many layers run on GPU. Set it to the maximum your VRAM allows.

Approximate speed comparison for a 14B model:

| Hardware | Tokens/second | Time for 500-token response |
|----------|--------------|----------------------------|
| NVIDIA RTX 4090 | 80-100 t/s | ~5 seconds |
| NVIDIA RTX 4070 | 40-60 t/s | ~10 seconds |
| Apple M3 Max (GPU) | 30-50 t/s | ~12 seconds |
| Apple M2 Pro (GPU) | 20-35 t/s | ~18 seconds |
| CPU only (modern) | 5-10 t/s | ~60 seconds |
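The last column is just arithmetic: generation time is roughly output tokens divided by throughput. A trivial helper if you want to estimate other hardware (the t/s values below are rough midpoints from the table):

```python
def response_seconds(tokens: int, tokens_per_sec: float) -> float:
    """Rough generation time: output length divided by throughput.
    Ignores prompt processing and model load time."""
    return tokens / tokens_per_sec

print(f"{response_seconds(500, 100):.1f}s at 100 t/s (high-end GPU)")
print(f"{response_seconds(500, 8):.1f}s at 8 t/s (CPU only)")
```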

### Memory requirements

The golden rule: you need enough VRAM (or unified memory on Apple Silicon) to fit the entire model. If the model does not fit in VRAM, it spills to system RAM, which is 10-20x slower.

```bash
# Check current memory usage
ollama ps

# Set maximum VRAM usage
OLLAMA_MAX_VRAM=20000 ollama serve  # 20GB limit
```

**Apple Silicon users:** You are in a good position. The unified memory architecture means your GPU can access all system RAM. A MacBook Pro with 36GB of unified memory can run 32B parameter models comfortably.

**NVIDIA users:** Your VRAM is the hard limit. A 24GB RTX 4090 fits most 32B quantized models. For 70B+ models, you need multi-GPU setups or significant CPU offloading.

### Context length optimization

Longer context windows use more memory. If you are running tight on VRAM, reduce the context length.

```bash
# Default context length is 2048
# Increase for larger codebases
ollama run qwen3.5-coder:32b --num-ctx 8192

# Reduce to save memory
ollama run qwen3.5-coder:32b --num-ctx 1024
```

In LM Studio, adjust the "Context Length" slider in the model settings panel before loading a model.

### Running multiple models

Ollama can keep multiple models loaded in memory simultaneously. This is useful when you want a fast small model for autocomplete and a large model for complex tasks.

```bash
# Load two models at once
OLLAMA_MAX_LOADED_MODELS=2 ollama serve
```

LM Studio loads one model at a time in the chat interface but can serve a different model via the API server simultaneously.

## Comparison: local vs cloud API

Neither local nor cloud is universally better. The right choice depends on your specific situation.

### When local models win

- **High-volume usage.** If you send hundreds of requests per day, local inference is essentially free after hardware costs. Cloud APIs charge per token.
- **Privacy requirements.** Regulated industries, proprietary codebases, or personal preference for data sovereignty. Local means no third-party data processing.
- **Offline workflows.** Traveling, unreliable connections, or air-gapped environments.
- **Latency-sensitive tasks.** Tab autocomplete, inline suggestions, and real-time code generation benefit from zero network latency.
- **Predictable costs.** No surprise bills. The hardware cost is fixed regardless of usage.

### When cloud APIs win

- **Maximum capability.** The largest cloud models (Claude, GPT-4.5, Gemini Ultra) are still significantly more capable than anything you can run locally. For complex multi-step reasoning, architectural decisions, or nuanced code review, cloud models have the edge.
- **No hardware investment.** You do not need an expensive GPU. A $20/month API subscription gives you access to frontier models.
- **Always up to date.** Cloud providers update models continuously. Local models require manual pulls and version management.
- **Scale to zero.** Pay only when you use it. If you have light, sporadic usage, cloud APIs are more cost-effective than dedicated hardware.
- **Multi-modal capabilities.** Cloud models increasingly support images, audio, and video inputs that local models cannot match.

### The hybrid approach (recommended)

The best setup for most developers is a hybrid approach:

- **Local model for autocomplete and quick tasks.** Run a fast 7B model for tab completion, inline suggestions, and quick questions. This handles 80% of your daily AI interactions with zero latency and zero cost.
- **Cloud API for complex tasks.** Use Claude or GPT-4.5 for architectural decisions, complex refactoring, multi-file changes, and deep code review. These tasks benefit from the larger model's superior reasoning.

```bash
# Example hybrid setup
# Terminal 1: Ollama running locally for autocomplete
ollama serve

# Terminal 2: LM Studio for model exploration and testing
# (launch the desktop app)

# Terminal 3: Use Claude Code for complex tasks (cloud)
claude

# Your editor: Continue.dev with Ollama for autocomplete,
# cloud model for chat
```

This gives you the best of both worlds: fast, free, private AI for routine tasks, and maximum capability when you need it.
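If you build your own tooling on top of this setup, the routing decision can start as a simple heuristic. A toy sketch - the model names and thresholds here are arbitrary placeholders, not recommendations:

```python
def pick_model(prompt: str, files_touched: int = 1) -> str:
    """Route a request: fast local model for small asks, cloud model for heavy ones."""
    heavy = files_touched > 1 or len(prompt) > 2000
    return "cloud/claude" if heavy else "ollama/qwen3.5-coder:7b"

print(pick_model("Rename this variable"))                   # small ask -> local
print(pick_model("Refactor the module", files_touched=4))   # multi-file -> cloud
```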

## Troubleshooting

### Ollama is not detecting my GPU

```bash
# Check GPU detection
ollama ps

# On Linux, ensure CUDA drivers are installed
nvidia-smi

# On macOS, Metal support is automatic for Apple Silicon
# Intel Macs do not have GPU acceleration in Ollama
```

### LM Studio shows "out of memory" when loading a model

Your model is too large for your available VRAM. Try:
1. Choose a smaller quantization (Q4 instead of Q8)
2. Reduce the GPU offload slider so more layers run on CPU
3. Lower the context length
4. Close other GPU-intensive applications
5. Choose a smaller model variant (7B instead of 14B)

### Models are slow on first load but fast after

This is normal. The first load reads the model from disk into memory. Subsequent inferences reuse the loaded model. Both Ollama and LM Studio keep models cached in memory until you explicitly unload them or run out of memory.

### API calls return connection refused

Make sure the server is actually running:

```bash
# For Ollama
curl http://localhost:11434/api/tags

# For LM Studio, check the Developer tab - the server toggle must be ON
curl http://localhost:1234/v1/models
```

## Next steps

Now that you have local AI running, here are some ways to go deeper:

- **Explore the model library.** Browse [ollama.com/library](https://ollama.com/library) or LM Studio's Discover tab for hundreds of available models.
- **Create custom models.** Write an Ollama `Modelfile` to create models with custom system prompts and parameters.
- **Set up a team server.** Run Ollama on a shared machine so your whole team can access local models over the network.
- **Try different quantizations.** Experiment with Q4 vs Q8 for your specific use case to find your quality-speed sweet spot.
- **Build with the API.** Use the OpenAI-compatible endpoints from either tool to integrate local AI into your own applications and scripts.

Local AI is not a replacement for cloud models. It is a complement that fills a different niche: fast, private, free, and always available. Set it up once, and it becomes a natural part of your development workflow.
]]></content:encoded>
      <pubDate>Thu, 09 Apr 2026 00:00:00 GMT</pubDate>
      <category>getting-started</category>
      <category>Guide</category>
      <author>Developers Digest</author>
      
    </item>
    <item>
      <title><![CDATA[Building Your First MCP Server]]></title>
      <link>https://www.developersdigest.tech/guides/building-your-first-mcp-server</link>
      <guid isPermaLink="true">https://www.developersdigest.tech/guides/building-your-first-mcp-server</guid>
      <description><![CDATA[Step-by-step guide to building an MCP server in TypeScript - from project setup to tool definitions, resource handling, testing, and deployment.]]></description>
      <content:encoded><![CDATA[
# Building Your First MCP Server

MCP (Model Context Protocol) is the standard way to give AI agents access to external tools and data. Instead of building custom integrations for every AI client, you build one MCP server that works with Claude Code, Cursor, Windsurf, and any other MCP-compatible tool.

This guide takes you from zero to a working MCP server in TypeScript. You will build a server that exposes tools, serves resources, and handles prompt templates. By the end, you will have a server you can plug into your AI coding workflow.

## Prerequisites

Before you start, make sure you have:

- **Node.js 18+** installed (`node --version` to check)
- **npm** or another package manager (pnpm, yarn, bun all work)
- **An MCP-compatible client** to test with (Claude Code is recommended)
- **Basic TypeScript knowledge** (types, async/await, imports)

No prior MCP experience is required. This guide explains every concept from scratch.

## What is an MCP server?

An MCP server is a program that exposes three types of capabilities to AI clients:

1. **Tools** - Functions the AI can call. Think of these as API endpoints the model invokes when it needs to take an action: read a database, send an email, create a file, query an API. Tools are the most commonly used MCP capability.

2. **Resources** - Data the AI can read. Resources provide context to the model, like files, database records, or API responses. They are read-only and let the model access information without calling a tool.

3. **Prompt templates** - Reusable prompt structures with placeholders. These help standardize how the AI interacts with your domain by providing pre-built prompts that users can fill in.

The server communicates with clients over one of two transports:

- **Stdio** - The server reads from stdin and writes to stdout. The client spawns the server as a child process. This is the simplest transport and the one most clients use.
- **HTTP/SSE** - The server runs as an HTTP service. Clients connect via Server-Sent Events. This is useful for remote servers, shared team servers, and production deployments.

For this guide, we will use stdio transport since it is simpler and works with the widest range of clients.

## Project setup

Create a new directory and initialize the project:

```bash
mkdir my-mcp-server
cd my-mcp-server
npm init -y
```

Install the MCP SDK and TypeScript:

```bash
npm install @modelcontextprotocol/sdk zod
npm install -D typescript @types/node
```

The `@modelcontextprotocol/sdk` package is the official TypeScript SDK for building MCP servers. `zod` is used for defining input schemas for your tools.

Create a `tsconfig.json`:

```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "Node16",
    "moduleResolution": "Node16",
    "outDir": "./dist",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "declaration": true
  },
  "include": ["src/**/*"]
}
```

Update your `package.json` to include the build script and set the module type:

```json
{
  "name": "my-mcp-server",
  "version": "1.0.0",
  "type": "module",
  "main": "dist/index.js",
  "bin": {
    "my-mcp-server": "dist/index.js"
  },
  "scripts": {
    "build": "tsc",
    "dev": "tsc --watch",
    "start": "node dist/index.js"
  }
}
```

Create the source directory:

```bash
mkdir src
```

## Building the server

### Step 1: Create the server entry point

Create `src/index.ts`. This is the main file that sets up the MCP server, defines its capabilities, and connects the transport.

```typescript
#!/usr/bin/env node

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";

// Create the MCP server instance
const server = new McpServer({
  name: "my-mcp-server",
  version: "1.0.0",
});

// We will add tools, resources, and prompts here in the next steps

// Connect using stdio transport
async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
  console.error("MCP server running on stdio");
}

main().catch((error) => {
  console.error("Fatal error:", error);
  process.exit(1);
});
```

Note that we log to `stderr` (via `console.error`), not `stdout`. This is important because stdout is reserved for the MCP protocol messages. Any logging you do must go to stderr.

### Step 2: Add your first tool

Tools are the core of most MCP servers. Let's add a simple tool that fetches the current weather for a city. This demonstrates the pattern you will use for all tools: define a name, description, input schema, and handler function.

Add this between the server creation and the `main()` function:

```typescript
import { z } from "zod";

server.tool(
  "get_weather",
  "Get the current weather for a city. Returns temperature, conditions, and humidity.",
  {
    city: z.string().describe("The city name, e.g. 'San Francisco'"),
    units: z
      .enum(["celsius", "fahrenheit"])
      .default("celsius")
      .describe("Temperature units"),
  },
  async ({ city, units }) => {
    // In a real server, you would call a weather API here.
    // For this example, we return mock data.
    const temp = units === "celsius" ? 22 : 72;
    const unitLabel = units === "celsius" ? "C" : "F";

    return {
      content: [
        {
          type: "text",
          text: `Weather in ${city}: ${temp} degrees ${unitLabel}, partly cloudy, 65% humidity.`,
        },
      ],
    };
  }
);
```

Let's break down the four arguments to `server.tool()`:

1. **Name** (`"get_weather"`) - A unique identifier for the tool. AI clients use this name to call the tool. Use snake_case by convention.

2. **Description** - A natural language explanation of what the tool does. The AI model reads this description to decide when to use the tool. Be specific about inputs, outputs, and when the tool is appropriate.

3. **Input schema** - A Zod schema defining the parameters the tool accepts. The SDK validates inputs against this schema before calling your handler. Zod's `.describe()` method adds parameter-level descriptions that help the AI fill in the right values.

4. **Handler** - An async function that receives the validated inputs and returns a result. The result must include a `content` array with text or image blocks.

### Step 3: Add a tool with error handling

Real tools need error handling. Here is a more realistic tool that reads a file from disk:

```typescript
import fs from "fs/promises";
import path from "path";

server.tool(
  "read_file",
  "Read the contents of a file at the given path. Returns the file content as text. Fails if the file does not exist or cannot be read.",
  {
    filePath: z.string().describe("Absolute or relative path to the file"),
  },
  async ({ filePath }) => {
    try {
      const resolvedPath = path.resolve(filePath);
      const content = await fs.readFile(resolvedPath, "utf-8");

      return {
        content: [
          {
            type: "text",
            text: content,
          },
        ],
      };
    } catch (error) {
      const message =
        error instanceof Error ? error.message : "Unknown error";

      return {
        content: [
          {
            type: "text",
            text: `Error reading file: ${message}`,
          },
        ],
        isError: true,
      };
    }
  }
);
```

Notice the `isError: true` flag in the error response. This tells the AI client that the tool invocation failed, so the model can adjust its approach (try a different path, ask the user for help, etc.) rather than treating the error message as successful output.

### Step 4: Add a tool that calls an external API

Here is a tool that demonstrates calling a real external service - searching a database, calling a REST API, or querying a third-party service:

```typescript
server.tool(
  "search_github_repos",
  "Search GitHub repositories by keyword. Returns the top 5 matching repos with name, description, stars, and URL.",
  {
    query: z.string().describe("Search query for GitHub repositories"),
    language: z
      .string()
      .optional()
      .describe("Filter by programming language, e.g. 'typescript'"),
  },
  async ({ query, language }) => {
    const params = new URLSearchParams({
      q: language ? `${query} language:${language}` : query,
      sort: "stars",
      order: "desc",
      per_page: "5",
    });

    const response = await fetch(
      `https://api.github.com/search/repositories?${params}`,
      {
        headers: {
          Accept: "application/vnd.github.v3+json",
          "User-Agent": "my-mcp-server",
        },
      }
    );

    if (!response.ok) {
      return {
        content: [
          {
            type: "text",
            text: `GitHub API error: ${response.status} ${response.statusText}`,
          },
        ],
        isError: true,
      };
    }

    const data = await response.json();
    const repos = data.items.map(
      (repo: {
        full_name: string;
        description: string | null;
        stargazers_count: number;
        html_url: string;
      }) => ({
        name: repo.full_name,
        description: repo.description || "No description",
        stars: repo.stargazers_count,
        url: repo.html_url,
      })
    );

    return {
      content: [
        {
          type: "text",
          text: JSON.stringify(repos, null, 2),
        },
      ],
    };
  }
);
```

### Step 5: Add resources

Resources provide read-only data to the AI client. They are useful for exposing configuration files, database state, or any data the model might need for context.

```typescript
server.resource(
  "config",
  "config://app/settings",
  async (uri) => {
    // In a real server, read from a config file or database
    const config = {
      appName: "My Application",
      version: "2.1.0",
      environment: process.env.NODE_ENV || "development",
      features: {
        darkMode: true,
        notifications: true,
        analytics: false,
      },
    };

    return {
      contents: [
        {
          uri: uri.href,
          mimeType: "application/json",
          text: JSON.stringify(config, null, 2),
        },
      ],
    };
  }
);
```

The `server.resource()` method takes three arguments:

1. **Name** - A human-readable name for the resource.
2. **URI** - A unique identifier using a custom scheme (like `config://` or `db://`). Clients use this URI to request the resource.
3. **Handler** - An async function that returns the resource contents. The handler receives the parsed URI object.

You can also add resources with dynamic URIs using templates:

```typescript
// Add ResourceTemplate to the existing import from mcp.js
import { ResourceTemplate } from "@modelcontextprotocol/sdk/server/mcp.js";

server.resource(
  "user-profile",
  new ResourceTemplate("users://{userId}/profile", { list: undefined }),
  async (uri, { userId }) => {
    // Fetch user data (mock example)
    const user = {
      id: userId,
      name: "Jane Developer",
      email: "jane@example.com",
      role: "admin",
    };

    return {
      contents: [
        {
          uri: uri.href,
          mimeType: "application/json",
          text: JSON.stringify(user, null, 2),
        },
      ],
    };
  }
);
```

The `ResourceTemplate` wrapper tells the SDK to match incoming URIs against the pattern and pass the extracted variables (here `userId`) to your handler, so you never have to parse the URI by hand.

### Step 6: Add prompt templates

Prompt templates are reusable prompt structures that help standardize how the AI interacts with your domain. They are optional but useful for common workflows.

```typescript
server.prompt(
  "code_review",
  "Review code for bugs, security issues, and best practices",
  {
    code: z.string().describe("The code to review"),
    language: z.string().describe("Programming language of the code"),
    focus: z
      .enum(["bugs", "security", "performance", "all"])
      .default("all")
      .describe("What to focus the review on"),
  },
  ({ code, language, focus }) => {
    const focusInstructions = {
      bugs: "Focus specifically on bugs, logic errors, and edge cases that could cause failures.",
      security:
        "Focus specifically on security vulnerabilities, injection risks, and unsafe patterns.",
      performance:
        "Focus specifically on performance bottlenecks, unnecessary allocations, and optimization opportunities.",
      all: "Review for bugs, security issues, performance problems, and general best practices.",
    };

    return {
      messages: [
        {
          role: "user",
          content: {
            type: "text",
            text: `Review the following ${language} code.\n\n${focusInstructions[focus]}\n\n\`\`\`${language}\n${code}\n\`\`\``,
          },
        },
      ],
    };
  }
);
```

### Step 7: The complete server

Here is the full `src/index.ts` with all the pieces assembled:

```typescript
#!/usr/bin/env node

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import fs from "fs/promises";
import path from "path";

const server = new McpServer({
  name: "my-mcp-server",
  version: "1.0.0",
});

// --- Tools ---

server.tool(
  "get_weather",
  "Get the current weather for a city",
  {
    city: z.string().describe("The city name"),
    units: z.enum(["celsius", "fahrenheit"]).default("celsius"),
  },
  async ({ city, units }) => {
    const temp = units === "celsius" ? 22 : 72;
    const unitLabel = units === "celsius" ? "C" : "F";
    return {
      content: [
        {
          type: "text",
          text: `Weather in ${city}: ${temp} degrees ${unitLabel}, partly cloudy, 65% humidity.`,
        },
      ],
    };
  }
);

server.tool(
  "read_file",
  "Read the contents of a file at the given path",
  {
    filePath: z.string().describe("Path to the file"),
  },
  async ({ filePath }) => {
    try {
      const content = await fs.readFile(path.resolve(filePath), "utf-8");
      return {
        content: [{ type: "text", text: content }],
      };
    } catch (error) {
      return {
        content: [
          {
            type: "text",
            text: `Error: ${error instanceof Error ? error.message : "Unknown error"}`,
          },
        ],
        isError: true,
      };
    }
  }
);

server.tool(
  "search_github_repos",
  "Search GitHub repositories by keyword",
  {
    query: z.string().describe("Search query"),
    language: z.string().optional().describe("Filter by language"),
  },
  async ({ query, language }) => {
    const params = new URLSearchParams({
      q: language ? `${query} language:${language}` : query,
      sort: "stars",
      order: "desc",
      per_page: "5",
    });

    const response = await fetch(
      `https://api.github.com/search/repositories?${params}`,
      {
        headers: {
          Accept: "application/vnd.github.v3+json",
          "User-Agent": "my-mcp-server",
        },
      }
    );

    if (!response.ok) {
      return {
        content: [
          { type: "text", text: `GitHub API error: ${response.status}` },
        ],
        isError: true,
      };
    }

    const data = await response.json();
    const repos = data.items.map(
      (repo: {
        full_name: string;
        description: string | null;
        stargazers_count: number;
        html_url: string;
      }) => ({
        name: repo.full_name,
        description: repo.description || "No description",
        stars: repo.stargazers_count,
        url: repo.html_url,
      })
    );

    return {
      content: [{ type: "text", text: JSON.stringify(repos, null, 2) }],
    };
  }
);

// --- Resources ---

server.resource("config", "config://app/settings", async (uri) => {
  return {
    contents: [
      {
        uri: uri.href,
        mimeType: "application/json",
        text: JSON.stringify(
          {
            appName: "My Application",
            version: "2.1.0",
            environment: process.env.NODE_ENV || "development",
          },
          null,
          2
        ),
      },
    ],
  };
});

// --- Prompt Templates ---

server.prompt(
  "code_review",
  "Review code for bugs, security issues, and best practices",
  {
    code: z.string().describe("The code to review"),
    language: z.string().describe("Programming language"),
  },
  ({ code, language }) => ({
    messages: [
      {
        role: "user",
        content: {
          type: "text",
          text: `Review the following ${language} code for bugs, security issues, and best practices:\n\n\`\`\`${language}\n${code}\n\`\`\``,
        },
      },
    ],
  })
);

// --- Start ---

async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
  console.error("MCP server running on stdio");
}

main().catch((error) => {
  console.error("Fatal error:", error);
  process.exit(1);
});
```

## Build and test

### Build the server

```bash
npm run build
```

This compiles TypeScript to JavaScript in the `dist/` directory. Make the output executable:

```bash
chmod +x dist/index.js
```

### Test with Claude Code

The fastest way to test your MCP server is with Claude Code. Add it to your project's `.mcp.json` file:

```json
{
  "mcpServers": {
    "my-server": {
      "command": "node",
      "args": ["/absolute/path/to/my-mcp-server/dist/index.js"]
    }
  }
}
```

Replace the path with the actual absolute path to your compiled server.

Start Claude Code in the project directory:

```bash
claude
```

Your MCP tools should appear in the tool list. Ask Claude to use one:

```
use the get_weather tool to check the weather in Tokyo
```

```
search GitHub for the top MCP server repositories written in TypeScript
```

If the tools do not appear, check for errors by running the server manually:

```bash
node dist/index.js
```

Any errors will print to stderr. Common issues:

- Missing `#!/usr/bin/env node` shebang line
- File not executable (run `chmod +x`)
- Module resolution errors (check `tsconfig.json` module settings)
- Missing dependencies (run `npm install`)
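If the problem is module resolution, compare your `tsconfig.json` against a known-good combination for an ESM package. The settings below assume `"type": "module"` in `package.json`; adjust if your setup from the earlier steps differs:

```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "Node16",
    "moduleResolution": "Node16",
    "outDir": "dist",
    "rootDir": "src",
    "strict": true,
    "esModuleInterop": true
  },
  "include": ["src"]
}
```

The `Node16` module settings are what make the `.js` extensions in the SDK import paths resolve correctly.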

### Test with the MCP Inspector

The MCP Inspector is an official debugging tool that lets you interact with your server directly through a web UI.

```bash
npx @modelcontextprotocol/inspector node dist/index.js
```

This opens a browser window where you can:

- See all registered tools, resources, and prompts
- Call tools with custom inputs and inspect the responses
- Read resources and verify their output
- Test prompt templates with different parameters

The Inspector is invaluable during development. Use it to verify your server works correctly before connecting it to an AI client.

### Test with Cursor

Add the server to Cursor's MCP configuration. Open `.cursor/mcp.json` in your project:

```json
{
  "mcpServers": {
    "my-server": {
      "command": "node",
      "args": ["/absolute/path/to/my-mcp-server/dist/index.js"]
    }
  }
}
```

Restart Cursor, and the tools will be available in Agent mode.

## Adding environment variables

Most real MCP servers need API keys, database URLs, or other configuration. Pass these as environment variables in the MCP config:

```json
{
  "mcpServers": {
    "my-server": {
      "command": "node",
      "args": ["/path/to/dist/index.js"],
      "env": {
        "WEATHER_API_KEY": "your-api-key-here",
        "DATABASE_URL": "postgresql://localhost:5432/mydb"
      }
    }
  }
}
```

Access them in your server code with `process.env`:

```typescript
server.tool(
  "get_real_weather",
  "Get real weather data from the weather API",
  {
    city: z.string().describe("City name"),
  },
  async ({ city }) => {
    const apiKey = process.env.WEATHER_API_KEY;
    if (!apiKey) {
      return {
        content: [
          { type: "text", text: "Error: WEATHER_API_KEY not configured" },
        ],
        isError: true,
      };
    }

    const response = await fetch(
      `https://api.weather.com/v1/current?city=${encodeURIComponent(city)}&key=${apiKey}`
    );

    const data = await response.json();
    return {
      content: [{ type: "text", text: JSON.stringify(data, null, 2) }],
    };
  }
);
```

## Publishing and deployment

### Publish to npm

If you want others to use your MCP server, publish it to npm:

1. Make sure `package.json` has the `bin` field set
2. Add a `files` field to include only the dist directory:

```json
{
  "name": "my-mcp-server",
  "version": "1.0.0",
  "type": "module",
  "bin": {
    "my-mcp-server": "dist/index.js"
  },
  "files": ["dist"],
  "scripts": {
    "build": "tsc",
    "prepublishOnly": "npm run build"
  }
}
```

3. Build and publish:

```bash
npm run build
npm publish
```

Users can then configure it in their MCP settings with:

```json
{
  "mcpServers": {
    "my-server": {
      "command": "npx",
      "args": ["-y", "my-mcp-server"]
    }
  }
}
```

The `npx -y` prefix downloads and runs the package automatically.

### Deploy as an HTTP server

For team-wide or remote access, you can serve your MCP server over HTTP instead of stdio. Replace the transport setup:

```typescript
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import express from "express";

const app = express();
app.use(express.json());

app.post("/mcp", async (req, res) => {
  // Stateless mode: create a fresh transport per request, no session tracking
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: undefined,
  });
  res.on("close", () => transport.close());
  await server.connect(transport);
  // handleRequest writes the response status and headers itself;
  // pass along the body that express.json() already parsed
  await transport.handleRequest(req, res, req.body);
});

app.listen(3001, () => {
  console.error("MCP HTTP server running on port 3001");
});
```

Clients connect using the HTTP transport:

```json
{
  "mcpServers": {
    "my-server": {
      "url": "http://localhost:3001/mcp"
    }
  }
}
```

## Best practices

### Write clear tool descriptions

The description is the most important part of a tool definition. The AI model reads it to decide when and how to use the tool. Good descriptions include:

- What the tool does in one sentence
- What inputs are required and what format they should be in
- What the output looks like
- When to use this tool vs another tool
- Any limitations or side effects

Bad: `"Search stuff"`

Good: `"Search GitHub repositories by keyword. Returns the top 5 matching repos with name, description, star count, and URL. Use this when the user asks about open-source projects, libraries, or wants to find code repositories."`

### Keep tools focused

Each tool should do one thing well. A tool called `manage_database` that creates tables, runs queries, and manages migrations is hard for the AI to use correctly. Split it into `create_table`, `run_query`, and `run_migration`.

### Validate inputs thoroughly

The Zod schema handles basic type validation, but add your own validation for business logic:

```typescript
server.tool(
  "delete_file",
  "Delete a file at the given path",
  {
    filePath: z.string().describe("Path to the file to delete"),
  },
  async ({ filePath }) => {
    const resolved = path.resolve(filePath);

    // Safety check: prevent deletion outside the project directory.
    // Append path.sep so a sibling like "/project-backup" cannot slip
    // past a plain prefix check against "/project".
    if (!resolved.startsWith(process.cwd() + path.sep)) {
      return {
        content: [
          {
            type: "text",
            text: "Error: Cannot delete files outside the project directory",
          },
        ],
        isError: true,
      };
    }

    await fs.unlink(resolved);
    return {
      content: [{ type: "text", text: `Deleted ${resolved}` }],
    };
  }
);
```

### Return structured data when possible

JSON responses let the AI extract specific fields and use them in follow-up operations. Plain text responses work for simple outputs, but structured data scales better:

```typescript
// Prefer this
return {
  content: [
    {
      type: "text",
      text: JSON.stringify(
        {
          status: "success",
          filesCreated: 3,
          outputPath: "/tmp/output",
        },
        null,
        2
      ),
    },
  ],
};
```

### Handle errors gracefully

Always return a meaningful error message with `isError: true` rather than letting an exception propagate. An unhandled exception aborts the tool call and gives the AI little information about what went wrong, while a descriptive error message lets it retry with different inputs or ask the user for help.
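One way to get this behavior consistently is a small wrapper around every handler. The `withErrorHandling` helper below is our own sketch, not part of the MCP SDK: it catches anything a handler throws and converts it into an `isError` result.

```typescript
// Hypothetical helper (not part of the MCP SDK): wraps a tool handler so
// that thrown exceptions become isError results instead of opaque failures.
type ToolResult = {
  content: { type: "text"; text: string }[];
  isError?: boolean;
};

function withErrorHandling<Args>(
  handler: (args: Args) => Promise<ToolResult>
): (args: Args) => Promise<ToolResult> {
  return async (args) => {
    try {
      return await handler(args);
    } catch (error) {
      // Surface the message so the AI can decide how to recover
      return {
        content: [
          {
            type: "text",
            text: `Error: ${error instanceof Error ? error.message : "Unknown error"}`,
          },
        ],
        isError: true,
      };
    }
  };
}
```

Handlers can then be registered as `server.tool(name, description, schema, withErrorHandling(async (args) => { ... }))`, giving every tool the same failure behavior without repeating try/catch blocks.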

## Next steps

Now that you have a working MCP server, explore these directions:

- **Browse existing servers.** The [MCP servers repository](https://github.com/modelcontextprotocol/servers) has dozens of production-quality servers you can study and use as references.
- **Add authentication.** For HTTP-based servers, add API key validation or OAuth to control access.
- **Build domain-specific tools.** Create servers for your team's internal tools - Jira, Slack, your production database, deployment pipeline, monitoring dashboards.
- **Use resource subscriptions.** Resources can notify clients when their data changes, enabling real-time context updates.
- **Read the [MCP specification](https://spec.modelcontextprotocol.io/).** The full spec covers advanced features like sampling, logging, and capability negotiation.

MCP is still early but growing fast. Every major AI coding tool now supports it, and the ecosystem of community servers expands weekly. Building your own server is the best way to understand the protocol and create tools that fit your exact workflow.
]]></content:encoded>
      <pubDate>Thu, 09 Apr 2026 00:00:00 GMT</pubDate>
      <category>ai-agents</category>
      <category>Guide</category>
      <author>Developers Digest</author>
      
    </item>
    <item>
      <title><![CDATA[AI Agent Frameworks Compared: CrewAI vs LangGraph vs AutoGen vs Claude Code]]></title>
      <link>https://www.developersdigest.tech/guides/ai-agent-frameworks-compared</link>
      <guid isPermaLink="true">https://www.developersdigest.tech/guides/ai-agent-frameworks-compared</guid>
      <description><![CDATA[Deep comparison of the top AI agent frameworks - architecture, code examples, strengths, weaknesses, and when to use each one.]]></description>
      <content:encoded><![CDATA[
# AI Agent Frameworks Compared: CrewAI vs LangGraph vs AutoGen vs Claude Code

Choosing an AI agent framework is one of the most consequential decisions in a project. The framework determines how your agents communicate, how you structure multi-step workflows, how much control you have over execution, and how painful it is to debug when things go wrong.

This guide provides a deep, practical comparison of the four most important agent frameworks in 2026: CrewAI, LangGraph, AutoGen, and Claude Code. We cover architecture, code examples, strengths, weaknesses, and concrete guidance on when to pick each one.

## What is an agent framework?

An agent framework provides the scaffolding for building AI applications that go beyond single prompt-response interactions. At minimum, a framework handles:

- **Agent definition** - Creating agents with specific roles, instructions, and capabilities
- **Tool integration** - Giving agents the ability to call external functions, APIs, and services
- **Orchestration** - Coordinating multiple agents or multi-step workflows
- **Memory** - Maintaining context across steps and conversations
- **Error handling** - Recovering from failures, retrying, and graceful degradation

Without a framework, you end up writing all of this plumbing yourself. Frameworks let you focus on the business logic of your agents rather than the infrastructure.

## Quick comparison

Before diving into each framework, here is a high-level comparison to orient your decision.

| Feature | CrewAI | LangGraph | AutoGen | Claude Code |
|---------|--------|-----------|---------|-------------|
| **Language** | Python | Python, JS/TS | Python, .NET | TypeScript (SDK) / CLI |
| **Architecture** | Role-based crews | Graph-based state machine | Conversation-based groups | Agentic loop + sub-agents |
| **Learning curve** | Low | High | Medium | Low |
| **Multi-agent** | Built-in crew system | Manual graph wiring | GroupChat pattern | Sub-agent spawning |
| **Model support** | Any (via LiteLLM) | Any (via integrations) | Any (via config) | Claude models only |
| **Tool definition** | Decorated functions | Annotated functions | Function schemas | MCP servers + built-in tools |
| **State management** | Automatic crew state | Explicit graph state | Conversation history | Conversation context + memory |
| **Streaming** | Limited | Full support | Limited | Full support |
| **Production readiness** | Growing | Mature | Growing | Production-grade |
| **Best for** | Team simulations, content pipelines | Complex stateful workflows | Research, multi-agent chat | Code generation, dev automation |
| **License** | MIT | MIT | CC-BY-4.0 (code MIT) | Proprietary (SDK open) |

## CrewAI

CrewAI takes a team metaphor and runs with it. You define agents as team members with specific roles (researcher, writer, reviewer), give them tools, and organize them into a "crew" that executes a sequence of tasks. The framework handles delegation, context passing between agents, and result aggregation.

### Architecture

```
[Crew]
  |
  +-- Agent: Researcher (role, goal, tools)
  |     |
  |     +-- Task: "Research the topic"
  |
  +-- Agent: Writer (role, goal, tools)
  |     |
  |     +-- Task: "Write the article"
  |
  +-- Agent: Editor (role, goal, tools)
        |
        +-- Task: "Edit and polish"
```

CrewAI uses a sequential or hierarchical process model. In sequential mode, tasks execute one after another, with each agent's output feeding into the next agent's context. In hierarchical mode, a manager agent delegates tasks to workers and synthesizes results.

### Code example

```python
from crewai import Agent, Task, Crew, Process
from crewai_tools import SerperDevTool

# Define tools
search_tool = SerperDevTool()

# Define agents
researcher = Agent(
    role="Senior Research Analyst",
    goal="Find comprehensive, accurate information about {topic}",
    backstory="You are an experienced researcher with deep expertise "
              "in technology and AI. You excel at finding primary sources "
              "and verifying claims.",
    tools=[search_tool],
    verbose=True,
)

writer = Agent(
    role="Technical Writer",
    goal="Write a clear, engaging article based on the research",
    backstory="You write for a developer audience. You explain complex "
              "topics simply without dumbing them down. You always include "
              "code examples when relevant.",
    verbose=True,
)

reviewer = Agent(
    role="Editor",
    goal="Review the article for accuracy, clarity, and completeness",
    backstory="You have a sharp eye for technical inaccuracies, unclear "
              "explanations, and missing context. You suggest specific edits.",
    verbose=True,
)

# Define tasks
research_task = Task(
    description="Research {topic} thoroughly. Find the latest developments, "
                "key players, technical details, and practical applications. "
                "Cite your sources.",
    expected_output="A detailed research report with sections, key findings, "
                    "and source URLs.",
    agent=researcher,
)

writing_task = Task(
    description="Using the research report, write a 1500-word article about "
                "{topic}. Include an introduction, 3-4 main sections with "
                "code examples, and a conclusion.",
    expected_output="A complete, well-structured article in markdown format.",
    agent=writer,
)

review_task = Task(
    description="Review the article for technical accuracy, clarity, and "
                "completeness. Provide specific suggestions and a final "
                "edited version.",
    expected_output="A list of edits and the final polished article.",
    agent=reviewer,
)

# Create and run the crew
crew = Crew(
    agents=[researcher, writer, reviewer],
    tasks=[research_task, writing_task, review_task],
    process=Process.sequential,
    verbose=True,
)

result = crew.kickoff(inputs={"topic": "MCP servers"})
print(result)
```

### Strengths

- **Intuitive mental model.** The crew/role metaphor maps directly to how people think about team collaboration. Non-technical stakeholders can understand the architecture.
- **Low boilerplate.** Getting a multi-agent pipeline running takes less than 50 lines of code. The framework handles context passing, agent coordination, and output formatting.
- **Built-in tool ecosystem.** CrewAI Tools provides ready-made tools for web search, file operations, code execution, and more. You can also wrap any Python function as a tool.
- **Flexible process models.** Sequential, hierarchical, and consensual process types cover most multi-agent patterns without custom orchestration code.
- **Model agnostic.** Works with OpenAI, Anthropic, Google, Ollama, and any provider supported by LiteLLM.

### Weaknesses

- **Limited control flow.** Complex branching logic, conditional execution, and dynamic task creation are harder to express than in graph-based frameworks. You are mostly constrained to linear or tree-shaped workflows.
- **Debugging opacity.** When a crew produces bad output, tracing which agent made the wrong decision and why can be difficult. The verbose mode helps but produces a lot of noise.
- **Token-heavy.** The role/backstory/goal system generates large system prompts for each agent. In long crews, the cumulative token cost can be significant.
- **Python only.** No official TypeScript or JavaScript SDK. If your stack is Node-based, CrewAI is not a natural fit.
- **Relatively new.** The API surface changes frequently between versions. Production deployments need to pin versions carefully.

### When to use CrewAI

Choose CrewAI when you need a multi-agent pipeline with well-defined roles and sequential (or hierarchical) task execution. It excels at content generation pipelines, research workflows, and any task where the "team of specialists" metaphor fits naturally. If you want the fastest path from idea to working multi-agent system, CrewAI is hard to beat.

---

## LangGraph

LangGraph models agent workflows as directed graphs where nodes are processing steps and edges define the flow between them. It is the most flexible framework in this comparison and the one that gives you the most control over execution flow, state management, and error handling.

### Architecture

```
[StateGraph]
  |
  +-- Node: "research" (function)
  |     |
  |     +-- Edge: if needs_more_info -> "research"
  |     +-- Edge: if complete -> "write"
  |
  +-- Node: "write" (function)
  |     |
  |     +-- Edge: -> "review"
  |
  +-- Node: "review" (function)
        |
        +-- Edge: if approved -> END
        +-- Edge: if needs_revision -> "write"
```

LangGraph uses a state machine pattern. You define a state schema, nodes that transform state, and edges (including conditional edges) that determine the next node based on the current state. This makes complex workflows with loops, branches, and dynamic routing straightforward.

### Code example

```python
from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, END
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage

# Define the state schema
class AgentState(TypedDict):
    topic: str
    research: str
    draft: str
    review_feedback: str
    final_article: str
    revision_count: int

# Initialize the model
model = ChatAnthropic(model="claude-sonnet-4-20250514")

# Define node functions
def research_node(state: AgentState) -> dict:
    messages = [
        SystemMessage(content="You are a thorough research analyst."),
        HumanMessage(
            content=f"Research the topic: {state['topic']}. "
                    f"Provide detailed findings with sources."
        ),
    ]
    response = model.invoke(messages)
    return {"research": response.content}


def write_node(state: AgentState) -> dict:
    context = state.get("review_feedback", "")
    revision_note = (
        f"\n\nPrevious feedback to address:\n{context}"
        if context
        else ""
    )

    messages = [
        SystemMessage(
            content="You are a technical writer for developers."
        ),
        HumanMessage(
            content=f"Write a 1500-word article based on this research:\n\n"
                    f"{state['research']}{revision_note}"
        ),
    ]
    response = model.invoke(messages)
    return {
        "draft": response.content,
        "revision_count": state.get("revision_count", 0) + 1,
    }


def review_node(state: AgentState) -> dict:
    messages = [
        SystemMessage(
            content="You are a strict technical editor. Respond with either "
                    "'APPROVED' followed by the final text, or 'NEEDS_REVISION' "
                    "followed by specific feedback."
        ),
        HumanMessage(content=f"Review this article:\n\n{state['draft']}"),
    ]
    response = model.invoke(messages)

    if "APPROVED" in response.content[:20]:
        return {
            "final_article": response.content.replace("APPROVED", "").strip(),
            "review_feedback": "",
        }
    else:
        return {
            "review_feedback": response.content.replace(
                "NEEDS_REVISION", ""
            ).strip()
        }


# Define routing logic
def should_revise(state: AgentState) -> str:
    if state.get("final_article"):
        return "end"
    if state.get("revision_count", 0) >= 3:
        # Give up after 3 revisions
        return "end"
    return "revise"


# Build the graph
graph = StateGraph(AgentState)

# Add nodes
graph.add_node("research", research_node)
graph.add_node("write", write_node)
graph.add_node("review", review_node)

# Add edges
graph.set_entry_point("research")
graph.add_edge("research", "write")
graph.add_edge("write", "review")

# Conditional edge: review can loop back to write or finish
graph.add_conditional_edges(
    "review",
    should_revise,
    {
        "revise": "write",
        "end": END,
    },
)

# Compile and run
app = graph.compile()

result = app.invoke({
    "topic": "Building MCP servers in TypeScript",
    "research": "",
    "draft": "",
    "review_feedback": "",
    "final_article": "",
    "revision_count": 0,
})

print(result["final_article"])
```

### Strengths

- **Maximum control.** Every aspect of the workflow is explicit: state schema, node functions, routing logic, and error handling. Nothing is hidden or magical.
- **Complex workflows.** Loops, branches, parallel execution, conditional routing, and dynamic node selection are first-class features. If you can draw it as a flowchart, you can build it in LangGraph.
- **Stateful by design.** The explicit state schema makes it easy to inspect, checkpoint, and resume workflows. You can save state to a database and resume later, which is essential for long-running tasks.
- **Streaming support.** Full streaming of intermediate steps and final output. You can show users what each node is doing in real time.
- **Language support.** Official Python and TypeScript/JavaScript SDKs, both production-quality.
- **LangSmith integration.** Built-in tracing and observability through LangSmith (LangChain's monitoring platform). Every node execution, LLM call, and state transition is logged and inspectable.

### Weaknesses

- **Steep learning curve.** The graph/state-machine paradigm is powerful but takes time to internalize. Simple tasks that take 10 lines in CrewAI require 50+ lines in LangGraph.
- **Verbose boilerplate.** State schemas, node functions, edge definitions, and compilation add significant code overhead for simple workflows.
- **LangChain dependency.** LangGraph is part of the LangChain ecosystem. While it works standalone, the most useful integrations pull in LangChain dependencies. If you have opinions about LangChain, those opinions apply here too.
- **Over-engineering risk.** The flexibility of graphs makes it tempting to build overly complex workflows. Simple sequential pipelines do not need conditional edges and state machines.
- **Documentation density.** The docs are comprehensive but dense. Finding the right pattern for your use case can take digging.

### When to use LangGraph

Choose LangGraph when your workflow has complex control flow - loops, branches, conditional execution, parallel paths, or human-in-the-loop checkpoints. It is the right choice for production systems where you need explicit state management, observability, and the ability to resume failed workflows. If your workflow is simple and sequential, LangGraph is overkill.

---

## AutoGen

AutoGen (by Microsoft) models multi-agent systems as conversations between agents. Instead of defining a graph or a task pipeline, you create agents and put them in a group chat where they talk to each other to solve problems. The framework handles turn-taking, message routing, and termination.

### Architecture

```
[GroupChat]
  |
  +-- Agent: Assistant (LLM-based)
  |     "I'll write the code."
  |
  +-- Agent: Critic (LLM-based)
  |     "Here are issues with the code."
  |
  +-- Agent: Executor (code execution)
  |     "I ran it. Here's the output."
  |
  +-- Agent: UserProxy (human-in-the-loop)
        "Looks good, proceed."
```

AutoGen's conversation-based approach is natural for tasks that benefit from debate, critique, and iterative refinement. Agents exchange messages in a shared conversation, and a speaker-selection mechanism determines who speaks next.

### Code example

```python
from autogen import (
    AssistantAgent,
    UserProxyAgent,
    GroupChat,
    GroupChatManager,
)

# Configuration for the LLM
llm_config = {
    "config_list": [
        {
            "model": "claude-sonnet-4-20250514",
            "api_key": "your-api-key",
            "api_type": "anthropic",
        }
    ],
    "temperature": 0.3,
}

# Define agents
coder = AssistantAgent(
    name="Coder",
    system_message=(
        "You are a senior software engineer. You write clean, well-tested "
        "TypeScript code. When asked to build something, provide complete, "
        "runnable code. Always include error handling."
    ),
    llm_config=llm_config,
)

reviewer = AssistantAgent(
    name="Reviewer",
    system_message=(
        "You are a code reviewer. You examine code for bugs, security "
        "issues, performance problems, and adherence to best practices. "
        "Be specific in your feedback. When the code is good, say APPROVED."
    ),
    llm_config=llm_config,
)

tester = AssistantAgent(
    name="Tester",
    system_message=(
        "You are a QA engineer. You write unit tests for the code provided. "
        "Use vitest for TypeScript tests. Aim for edge cases and error "
        "conditions, not just happy paths."
    ),
    llm_config=llm_config,
)

# UserProxy executes code and provides human input
user_proxy = UserProxyAgent(
    name="UserProxy",
    human_input_mode="TERMINATE",
    max_consecutive_auto_reply=10,
    code_execution_config={
        "work_dir": "workspace",
        "use_docker": False,
    },
)

# Create group chat
group_chat = GroupChat(
    agents=[user_proxy, coder, reviewer, tester],
    messages=[],
    max_round=15,
    speaker_selection_method="auto",
)

manager = GroupChatManager(
    groupchat=group_chat,
    llm_config=llm_config,
)

# Start the conversation
user_proxy.initiate_chat(
    manager,
    message=(
        "Build a TypeScript CLI tool that converts CSV files to JSON. "
        "It should handle headers, quoted fields, and custom delimiters. "
        "Include error handling for malformed input."
    ),
)
```

### Strengths

- **Natural conversation flow.** The group chat pattern feels intuitive for tasks that benefit from discussion, debate, and iterative refinement. Agents naturally build on each other's contributions.
- **Code execution.** Built-in support for running code in sandboxed environments (Docker or local). Agents can write code, execute it, see the output, and fix issues in a loop.
- **Human-in-the-loop.** The UserProxy agent makes it easy to insert human approval, feedback, or corrections at any point in the conversation.
- **Flexible speaker selection.** The framework can automatically decide which agent should speak next based on the conversation context, or you can define explicit turn-taking rules.
- **Microsoft ecosystem.** Deep integration with Azure OpenAI and strong backing from Microsoft Research. Active development and regular releases.

### Weaknesses

- **Unpredictable execution.** The conversation-based approach means you do not always know how many turns a task will take or which agent will handle what. This makes cost estimation and timeout management harder than in deterministic frameworks.
- **Token cost.** Every agent sees the full conversation history. With 4 agents and 15 rounds, the context grows rapidly. Long conversations can burn through tokens fast.
- **Limited structure.** There is no built-in concept of "tasks" or "workflow steps." The structure emerges from the conversation, which can be both a strength (flexibility) and a weakness (unpredictability).
- **Speaker selection issues.** The auto speaker selection sometimes picks the wrong agent or gets stuck in loops. Custom speaker selection functions help but add complexity.
- **Setup complexity.** Configuration objects, agent definitions, and execution environments have many options. Getting the right configuration for your use case takes experimentation.
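That context growth is easy to underestimate, so a rough back-of-envelope model helps. This is a sketch, not AutoGen code; the per-message token count is an assumption, and real messages vary widely:

```typescript
// Rough model of context growth when every turn re-sends the full
// conversation history, as in an AutoGen group chat.
// avgTokensPerMessage is an assumed constant for illustration only.
function estimateInputTokens(rounds: number, avgTokensPerMessage: number): number {
  let total = 0;
  for (let turn = 1; turn <= rounds; turn++) {
    // Turn N processes the N-1 earlier messages plus the new one.
    total += turn * avgTokensPerMessage;
  }
  return total;
}

// 15 rounds at ~500 tokens per message is already ~60k input tokens,
// before system prompts, tool output, or multiple agents re-reading history.
console.log(estimateInputTokens(15, 500));
```

Because the total grows roughly with the square of the round count, doubling `max_round` roughly quadruples the input-token bill - which is why capping rounds and compacting history matter.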

### When to use AutoGen

Choose AutoGen when your problem benefits from iterative discussion between agents - code generation with review cycles, research with fact-checking, or any task where agents need to debate and refine each other's work. It is particularly strong for code-generation workflows where agents write, test, review, and fix code in a conversational loop. If you need deterministic, repeatable workflows, look elsewhere.

---

## Claude Code

Claude Code is different from the other three frameworks. It is not a library you import into your code - it is a complete AI coding agent that runs in your terminal (or IDE, or web browser). You interact with it through natural language, and it reads your codebase, edits files, runs commands, and manages git operations.

What makes Claude Code relevant as an "agent framework" is its sub-agent system. You can spawn multiple Claude Code instances as sub-agents, each working on a separate task in parallel, coordinated by a parent agent. Combined with MCP servers for external tool integration and hooks for lifecycle automation, Claude Code functions as a full agent orchestration system.

### Architecture

```
[Claude Code - Parent Agent]
  |
  +-- Sub-Agent: "Research the API docs"
  |     (reads files, searches web, returns summary)
  |
  +-- Sub-Agent: "Write the implementation"
  |     (edits files, runs tests, fixes errors)
  |
  +-- Sub-Agent: "Update the documentation"
  |     (reads code changes, updates README and docs)
  |
  +-- MCP Server: Database (query, insert, update)
  +-- MCP Server: Deployment (deploy, rollback, status)
  +-- Hooks: pre-commit linter, post-edit test runner
```

### Code example (SDK usage)

While Claude Code is primarily a CLI tool, the Claude Code SDK lets you drive it programmatically from TypeScript. The example below is schematic - consult the SDK reference for the exact entry points:

```typescript
import { ClaudeCode } from "@anthropic-ai/claude-code";

const claude = new ClaudeCode();

// Simple one-shot task
const result = await claude.run({
  prompt: "Add input validation to the signup form in src/components/SignupForm.tsx",
  workingDirectory: "/path/to/project",
});

console.log(result.output);

// Multi-step workflow with sub-agents
async function buildFeature(featureDescription: string) {
  // Step 1: Research
  const research = await claude.run({
    prompt: `Analyze the current codebase and determine the best approach for: ${featureDescription}. Do not make any changes. Return a plan.`,
    workingDirectory: "/path/to/project",
  });

  // Step 2: Implement (using the research as context)
  const implementation = await claude.run({
    prompt: `Implement this feature based on the following plan:\n\n${research.output}\n\nWrite the code, run the tests, and fix any failures.`,
    workingDirectory: "/path/to/project",
  });

  // Step 3: Review
  const review = await claude.run({
    prompt: "Review all changes made in the last commit. Check for bugs, security issues, and missing test coverage. Fix any issues you find.",
    workingDirectory: "/path/to/project",
  });

  return { research, implementation, review };
}

const feature = await buildFeature("Add dark mode support with system preference detection");
```

### CLI workflow example

Most Claude Code usage happens interactively in the terminal:

```bash
# Start a session
cd ~/my-project
claude

# Inside the session, use natural language:
# "Add a rate limiter to the API endpoints"
# "Write tests for the payment module and fix any failures"
# "Refactor the auth middleware to use the new session system"

# Or use non-interactive mode for scripting:
claude -p "Add TypeScript strict mode to this project and fix all type errors"

# Spawn sub-agents for parallel work:
# (Inside a Claude Code session)
# "Parallelize this: research the Stripe API, write the webhook handler,
#  and update the docs - use sub-agents for each task"
```

### Strengths

- **Zero boilerplate.** No framework setup, no agent definitions, no state schemas. Point it at a codebase and describe what you want.
- **Full codebase understanding.** Claude Code reads your entire project - files, imports, dependencies, git history, tests. It has context that API-based frameworks cannot match.
- **Real tool execution.** It actually runs commands, edits files, and verifies its work by running tests. This is not simulated tool use - it is real system interaction.
- **MCP integration.** Connect any MCP server to extend Claude Code's capabilities. Database access, deployment pipelines, monitoring dashboards - all available as tools.
- **Sub-agent parallelism.** Spawn multiple agents working on different tasks simultaneously. A parent agent coordinates and synthesizes the results.
- **Hooks system.** Automate pre/post actions: run linters before commits, execute tests after edits, trigger deployments after merges.
- **Cross-platform.** CLI, VS Code, JetBrains, desktop app, web interface, Slack, GitHub Actions - same agent, same config, multiple surfaces.
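Hooks are configured in your Claude Code settings file. A minimal example that runs the linter after every file edit might look like this (schema abbreviated, and the lint command is illustrative; see the hooks documentation for the full set of events and options):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "npm run lint" }
        ]
      }
    ]
  }
}
```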

### Weaknesses

- **Claude-only.** Locked to Anthropic's Claude models. You cannot swap in GPT, Gemini, or open-source models. If Claude goes down or Anthropic changes pricing, you have no fallback.
- **Not a library.** You cannot embed Claude Code's agent logic into your own Python or Node application the way you can with CrewAI or LangGraph. The SDK gives you programmatic access but not framework-level control over the agent loop.
- **Cost.** Claude Code uses Claude models, which are not free. Heavy usage on Max plan ($200/month) or API billing can get expensive compared to running open-source models with other frameworks.
- **Less customizable orchestration.** You describe what you want in natural language. You cannot define explicit state machines, conditional edges, or custom routing logic the way you can in LangGraph.
- **Subscription required.** You need a Claude Pro, Max, Teams, or Enterprise subscription, or Anthropic API credits.

### When to use Claude Code

Choose Claude Code when your primary task is software development - writing code, fixing bugs, refactoring, adding features, managing git. It is the most capable coding agent available and requires zero framework setup. For multi-agent orchestration beyond coding (content pipelines, data processing, business workflows), pair it with one of the other frameworks or use the SDK to build custom orchestration.

---

## Decision framework

Use this flowchart to pick the right framework for your project.

**Start here: What is your primary task?**

**If code generation and development automation:**
- Use **Claude Code**. It understands codebases natively, runs real commands, and requires no setup. For complex multi-repo orchestration, add the SDK.

**If content/research pipeline with defined roles:**
- Use **CrewAI**. The crew metaphor maps perfectly to content workflows where specialists hand off work in sequence. Fastest time to working prototype.

**If complex stateful workflow with branches and loops:**
- Use **LangGraph**. When you need explicit control over execution flow, state checkpointing, conditional routing, and resumable workflows, LangGraph is the only choice that gives you full control.

**If iterative refinement through debate/critique:**
- Use **AutoGen**. When agents need to discuss, critique, and iteratively improve each other's work, the conversation-based model is the most natural fit.

**If you need multiple frameworks:**
- This is common and fine. Use Claude Code for the coding tasks and CrewAI or LangGraph for the orchestration layer. They are not mutually exclusive.

## Combining frameworks

In practice, production systems often combine frameworks. Here are patterns that work well:

**Claude Code + LangGraph:** Use LangGraph to define the overall workflow (research, implement, test, deploy) and spawn Claude Code sub-agents for the coding steps. LangGraph handles state management and routing; Claude Code handles the actual development.

**CrewAI + Claude Code:** Use a CrewAI crew for content generation (research, write, edit) and trigger Claude Code to implement any code examples or build any tools referenced in the content.

**LangGraph + AutoGen:** Use LangGraph for the high-level workflow graph and AutoGen group chats within specific nodes where agents need to discuss and iterate.
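Whatever the combination, the simplest glue is often the CLI itself: shell out to `claude -p` from your orchestration layer. This is a minimal sketch of the "framework orchestrates, Claude Code implements" pattern - `runPipeline` and the step prompts are illustrative, not part of any framework API - with the runner injectable so the orchestration logic can be tested without the CLI:

```typescript
import { execFileSync } from "node:child_process";

// Each workflow step is a prompt handed to `claude -p` (non-interactive mode).
type Runner = (prompt: string) => string;

// Real runner: shells out to the Claude Code CLI.
const claudeRunner: Runner = (prompt) =>
  execFileSync("claude", ["-p", prompt], { encoding: "utf8" });

// Runs steps in sequence, feeding each step's output forward as context.
function runPipeline(steps: string[], run: Runner = claudeRunner): string[] {
  const outputs: string[] = [];
  let context = "";
  for (const step of steps) {
    const output = run(context ? `${step}\n\nPrevious step output:\n${context}` : step);
    outputs.push(output);
    context = output;
  }
  return outputs;
}

// Example (illustrative prompts):
// runPipeline([
//   "Analyze the codebase and plan the dark mode feature. Do not edit files.",
//   "Implement the plan. Run the tests and fix any failures.",
//   "Review the changes in the last commit and fix any issues you find.",
// ]);
```

A real LangGraph or CrewAI integration would wrap each step in a graph node or task, but the core idea is the same: the framework owns state and routing, and Claude Code does the coding work inside each step.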

## Final comparison

| Dimension | CrewAI | LangGraph | AutoGen | Claude Code |
|-----------|--------|-----------|---------|-------------|
| **Time to prototype** | Hours | Days | Hours | Minutes |
| **Production readiness** | Medium | High | Medium | High |
| **Debugging experience** | Fair | Good | Fair | Good |
| **Cost at scale** | Varies by model | Varies by model | Varies by model | Claude pricing |
| **Community size** | Large, growing | Large, mature | Large, growing | Very large |
| **Documentation** | Good | Dense but thorough | Improving | Excellent |
| **TypeScript support** | No | Yes | No (Python/.NET) | Native |
| **Custom model support** | Yes (any) | Yes (any) | Yes (any) | No (Claude only) |
| **Determinism** | Low-Medium | High | Low | Low-Medium |
| **Max complexity** | Medium | Very High | Medium | High |

There is no universally "best" framework. Each one reflects a different philosophy about how agents should work. CrewAI says agents are team members. LangGraph says agents are nodes in a graph. AutoGen says agents are participants in a conversation. Claude Code says the agent is your pair programmer.

Pick the philosophy that matches your problem, and you will build faster with fewer headaches.

## Next steps

- **[CrewAI docs](https://docs.crewai.com/)** - Official documentation and tutorials
- **[LangGraph docs](https://langchain-ai.github.io/langgraph/)** - Tutorials, how-to guides, and API reference
- **[AutoGen docs](https://microsoft.github.io/autogen/)** - Getting started and advanced patterns
- **[Claude Code docs](https://docs.anthropic.com/en/docs/claude-code)** - Setup, configuration, and best practices
- **[AI Agents Explained](/blog/ai-agents-explained)** - Foundations of how AI agents work
- **[Multi-Agent Systems](/blog/multi-agent-systems)** - Deep dive into multi-agent architectures
- **[Building Your First MCP Server](/guides/building-your-first-mcp-server)** - Build tools that any MCP-compatible agent can use
]]></content:encoded>
      <pubDate>Thu, 09 Apr 2026 00:00:00 GMT</pubDate>
      <category>ai-agents</category>
      <category>Guide</category>
      <author>Developers Digest</author>
      
    </item>
    <item>
      <title><![CDATA[Getting Started with Claude Code]]></title>
      <link>https://www.developersdigest.tech/guides/claude-code-getting-started</link>
      <guid isPermaLink="true">https://www.developersdigest.tech/guides/claude-code-getting-started</guid>
      <description><![CDATA[Install Claude Code, configure your first project, and start shipping code with AI in under 5 minutes.]]></description>
      <content:encoded><![CDATA[
# Getting Started with Claude Code

Claude Code is Anthropic's AI coding agent. It runs in your terminal, reads your entire codebase, edits files, runs commands, manages git, and builds features from plain English descriptions. This guide walks you through installation, your first session, project configuration, and the workflows that make Claude Code worth using every day.

## Prerequisites

Before you start, make sure you have:

- A terminal (macOS Terminal, iTerm2, Windows Terminal, or any Linux terminal)
- Git installed and configured
- A Claude subscription (Pro at $20/mo, Max at $100/mo or $200/mo, Teams, or Enterprise) or an Anthropic Console account with API credits

Node.js is not required for the recommended installation method.

## Install Claude Code

The recommended way to install Claude Code is the native installer, which handles auto-updates automatically.

### macOS and Linux

```bash
curl -fsSL https://claude.ai/install.sh | bash
```

### Windows PowerShell

```powershell
irm https://claude.ai/install.ps1 | iex
```

### Windows CMD

```cmd
curl -fsSL https://claude.ai/install.cmd -o install.cmd && install.cmd && del install.cmd
```

Windows users need [Git for Windows](https://git-scm.com/downloads/win) installed first.

### Alternative installation methods

**Homebrew (macOS):**

```bash
brew install --cask claude-code
```

**WinGet (Windows):**

```powershell
winget install Anthropic.ClaudeCode
```

**npm (any platform with Node.js 18+):**

```bash
npm install -g @anthropic-ai/claude-code
```

Homebrew, WinGet, and npm installations do not auto-update. You will need to manually upgrade periodically.

### Verify the installation

```bash
claude --version
```

You should see a version number printed to your terminal. If not, restart your terminal and try again.

## Your first session

Navigate to any project directory and start Claude Code:

```bash
cd ~/my-project
claude
```

On first launch, Claude Code opens a browser window for authentication. Log in with your Claude account and return to the terminal. Your credentials are stored locally - you will not need to log in again.

You will see the Claude Code welcome screen with your session info and recent conversations. The cursor sits at a prompt where you type natural language instructions. No special syntax required.

### Ask your first question

Start by understanding what you are working with:

```
what does this project do?
```

Claude reads your project files and returns a summary of the codebase, its structure, and the technologies used. You can follow up with more specific questions:

```
explain the folder structure
```

```
where is the main entry point?
```

```
what dependencies does this project use?
```

Claude Code reads files on demand as it needs them. You do not need to manually point it at specific files.

### Make your first code change

Tell Claude what you want in plain English:

```
add input validation to the signup form
```

Claude Code will:

1. Find the relevant files in your codebase
2. Show you the proposed changes as a diff
3. Wait for your approval before writing anything
4. Apply the changes once you confirm

You always see exactly what Claude plans to change before it touches a file. Press `y` to accept or `n` to reject each change.

## Set up CLAUDE.md

CLAUDE.md is a markdown file in your project root that tells Claude Code about your project. It loads automatically at the start of every session. Think of it as a README written specifically for your AI coding partner.

### Generate one automatically

The fastest way to create a CLAUDE.md is to let Claude do it:

```
/init
```

Claude analyzes your codebase and generates a CLAUDE.md with build commands, test instructions, directory structure, and coding conventions it discovers. Review the output and refine it with anything Claude would not know on its own.

### Write one manually

Create a `CLAUDE.md` file in your project root:

```markdown
# My Project

## Stack
Next.js 16 + Convex + Clerk + Tailwind CSS v4

## Key Directories
- src/app/ -- Pages and layouts (App Router)
- src/components/ -- React components
- convex/ -- Backend functions and schema
- src/lib/ -- Shared utilities

## Commands
- npm run dev -- Start dev server on port 3000
- npx convex dev -- Start Convex backend
- npm test -- Run test suite
- npm run lint -- Run ESLint

## Conventions
- Use TypeScript strict mode
- Prefer server components by default
- Use 2-space indentation
- Write tests for all new utilities
```

### What to include

A good CLAUDE.md covers:

- **Stack and architecture.** What frameworks, languages, and tools the project uses.
- **Directory structure.** Where key code lives so Claude finds things faster.
- **Build and test commands.** The exact commands to build, test, lint, and deploy.
- **Coding conventions.** Indentation, naming, file organization, import patterns.
- **Workflow rules.** Things like "always run tests before committing" or "use conventional commits."

Keep it under 200 lines. Concise instructions get followed more reliably than long documents. If you need more detail, split it into files under `.claude/rules/` - these load automatically alongside your CLAUDE.md.

### CLAUDE.md locations

CLAUDE.md files can live in multiple places, each with a different scope:

| Location | Scope | Shared with |
|----------|-------|-------------|
| `./CLAUDE.md` | This project | Team via git |
| `./.claude/CLAUDE.md` | This project | Team via git |
| `~/.claude/CLAUDE.md` | All your projects | Just you |

Project-level files are great for team standards. Personal files are for your own preferences across all projects.

## Essential commands

These are the commands you will use daily:

| Command | What it does |
|---------|-------------|
| `claude` | Start an interactive session |
| `claude "task"` | Start a session with an initial task |
| `claude -p "query"` | Run a one-off query and exit (no interactive session) |
| `claude -c` | Continue the most recent conversation |
| `claude -r` | Resume a previous conversation from a list |
| `claude commit` | Create a git commit with an AI-generated message |

### In-session commands

Once inside a Claude Code session, these slash commands are available:

| Command | What it does |
|---------|-------------|
| `/help` | Show all available commands |
| `/init` | Generate or improve your CLAUDE.md |
| `/memory` | View and manage loaded instructions and auto memory |
| `/compact` | Compress conversation history to free up context |
| `/clear` | Clear conversation history entirely |
| `exit` or Ctrl+C | Exit the session |

Press `?` in a session to see all keyboard shortcuts. Use Tab for command completion and the up arrow for command history.

## Key features

### File editing

Claude Code reads and edits files directly. It shows you a diff of every proposed change and waits for approval before writing. You can ask it to:

```
refactor the auth middleware to use async/await
```

```
add error handling to all API routes
```

```
rename the User model to Account across the entire codebase
```

Claude handles multi-file changes in a single operation. It understands imports, references, and dependencies across your project.

### Test running

Claude Code runs your test suite and interprets the results:

```
run the tests and fix any failures
```

```
write unit tests for the payment module, then run them
```

```
add integration tests for the user API endpoints
```

It reads test output, identifies failures, fixes the code, and re-runs tests until they pass. This loop is one of the most powerful workflows in Claude Code.

### Git integration

Git operations become conversational:

```
what files have I changed?
```

```
commit my changes with a descriptive message
```

```
create a branch called feature/user-profiles
```

```
create a pull request for this feature
```

```
help me resolve these merge conflicts
```

The `claude commit` shortcut is particularly useful. Run it from the command line and Claude stages your changes, writes a commit message based on the actual diff, and commits - all in one step.

### Plan mode

For complex tasks, use Plan mode to get Claude to analyze and plan before making changes:

```
use plan mode: refactor the database layer to support multi-tenancy
```

In Plan mode, Claude reads your code and produces a detailed plan without editing anything. Once you review and approve the plan, switch to normal mode to execute it. This is useful for large refactors, architectural changes, or any task where you want to think before acting.

### Piping and scripting

Claude Code follows Unix conventions. You can pipe data in and out:

```bash
# Analyze log output
tail -200 app.log | claude -p "summarize any errors in this log"

# Review changed files
git diff main --name-only | claude -p "review these files for security issues"

# Generate from a template
cat template.md | claude -p "fill in this template for our new API endpoint"
```

The `-p` flag runs Claude in non-interactive mode, making it composable with other CLI tools.

## Common workflows

### Explore a new codebase

```
give me an overview of this codebase
```

```
explain the main architecture patterns used here
```

```
trace the request flow from the API endpoint to the database
```

### Fix a bug

```
I'm getting "Cannot read property of undefined" when users submit the form. Fix it.
```

Claude traces the error through your code, identifies the root cause, and implements the fix. Give it the exact error message and any steps to reproduce.

### Add a feature

```
add a dark mode toggle to the settings page. Use the existing theme system.
```

Claude plans the approach, writes the code across multiple files, and verifies it works with your existing patterns.

### Write and run tests

```
write tests for the payment processing module, run them, and fix any failures
```

This single prompt triggers Claude to write test files, execute your test runner, read the output, fix any failures, and repeat until everything passes.

### Refactor

```
refactor the user service from callbacks to async/await
```

```
split this 500-line component into smaller, reusable components
```

### Create a pull request

```
create a PR with a summary of all the changes we made in this session
```

Claude stages changes, creates a branch, writes a PR title and description, and opens the pull request.

## Tips for better results

**Be specific.** "Fix the login bug where users see a blank screen after entering wrong credentials" works much better than "fix the login bug."

**Give context.** If you know where the problem is, say so. "The issue is in src/auth/login.ts around line 45" saves Claude from searching the entire codebase.

**Break big tasks into steps.** Instead of "build a complete user management system," try:

```
1. create a database schema for user profiles
2. add API endpoints for CRUD operations on profiles
3. build a settings page that uses those endpoints
```

**Let Claude explore first.** Before asking for changes, let Claude understand the code:

```
analyze the payment module before we make changes
```

**Use auto memory.** Claude Code automatically remembers things across sessions - build commands, debugging insights, your preferences. You can also tell it explicitly: "remember that the tests require a local Redis instance."

**Keep CLAUDE.md current.** When your project conventions change, update CLAUDE.md. Outdated instructions cause confusion.

## Where to use Claude Code

Claude Code is available across multiple surfaces, all sharing the same configuration:

| Surface | Best for |
|---------|----------|
| Terminal CLI | Full-featured coding, scripting, automation |
| VS Code extension | Inline diffs, editor integration |
| JetBrains plugin | IntelliJ, PyCharm, WebStorm integration |
| Desktop app | Visual diff review, multiple sessions, scheduled tasks |
| Web (claude.ai/code) | No local setup, long-running tasks, mobile access |
| Slack | Turning bug reports in team channels into pull requests |
| GitHub Actions | Automated PR review and issue triage |

Your CLAUDE.md files, settings, and MCP servers work across all of them.

## Next steps

Once you are comfortable with the basics:

- **[CLAUDE.md deep dive](/guides/claude-code-setup)** - Advanced configuration including custom skills, hooks, and MCP servers
- **[MCP Servers](/guides/mcp-servers)** - Connect external tools to Claude Code
- **[Official docs](https://code.claude.com/docs/en/overview)** - Full reference documentation from Anthropic
- **[Best practices](https://code.claude.com/docs/en/best-practices)** - Patterns for getting the most out of Claude Code
- **[Common workflows](https://code.claude.com/docs/en/common-workflows)** - Detailed guides for specific development tasks

Claude Code gets more useful the more you invest in CLAUDE.md and your project configuration. Start simple, iterate as you learn what works, and let auto memory handle the rest.
]]></content:encoded>
      <pubDate>Thu, 02 Apr 2026 00:00:00 GMT</pubDate>
      <category>getting-started</category>
      <category>Guide</category>
      <author>Developers Digest</author>
      
    </item>
    <item>
      <title><![CDATA[Getting Started with DevDigest CLI]]></title>
      <link>https://www.developersdigest.tech/guides/getting-started</link>
      <guid isPermaLink="true">https://www.developersdigest.tech/guides/getting-started</guid>
      <description><![CDATA[Install the dd CLI and scaffold your first AI-powered app in under a minute.]]></description>
      <content:encoded><![CDATA[
# Getting Started

## Install

```bash
npm install -g devdigest
```

## Create a project

```bash
dd init my-app
```

This scaffolds a complete app with:
- **Next.js 16** -- React framework with App Router
- **Convex** -- Reactive backend with real-time sync
- **Clerk** -- Authentication (sign-in, sign-up, user management)
- **Autumn** -- Billing and subscriptions
- **Tailwind CSS v4** -- Utility-first styling
- **CLAUDE.md** -- Agent-friendly project documentation

## Next steps

```bash
cd my-app
# Add your API keys to .env.local
npx convex dev
npm run dev
```

## Use with AI coding tools

The generated CLAUDE.md file makes your project immediately usable with any AI coding tool:

**Claude Code:**
```bash
cd my-app
claude
```

**Cursor:**
Open the project in Cursor -- it reads CLAUDE.md automatically.

**Any MCP-compatible tool:**
```json
{
  "mcpServers": {
    "devdigest": {
      "command": "dd",
      "args": ["mcp"]
    }
  }
}
```

## Copy this prompt for your AI agent

> You are working on a Next.js 16 project scaffolded with the DevDigest CLI. Read the CLAUDE.md file for full stack details. The project uses Convex for the backend, Clerk for auth, and Autumn for billing. All environment variables are listed in .env.example.
]]></content:encoded>
      <pubDate>Sun, 08 Mar 2026 00:00:00 GMT</pubDate>
      <category>getting-started</category>
      <category>Guide</category>
      <author>Developers Digest</author>
      <enclosure url="https://www.developersdigest.tech/images/infographics/getting-started-guide.webp" type="image/webp" />
    </item>
    <item>
      <title><![CDATA[Claude Code Setup Guide]]></title>
      <link>https://www.developersdigest.tech/guides/claude-code-setup</link>
      <guid isPermaLink="true">https://www.developersdigest.tech/guides/claude-code-setup</guid>
      <description><![CDATA[Configure Claude Code for maximum productivity -- CLAUDE.md, sub-agents, MCP servers, and autonomous workflows.]]></description>
      <content:encoded><![CDATA[
# Claude Code Setup Guide

> **Prerequisites:** Node.js 18+, a terminal (macOS/Linux/WSL), and an Anthropic subscription (Pro $20/mo or Max $200/mo). Familiarity with the command line is assumed.

Claude Code is a terminal-based AI coding agent from Anthropic. It reads your codebase, edits files, runs tests, and commits -- all autonomously.

## Install

```bash
npm install -g @anthropic-ai/claude-code
```

## CLAUDE.md -- Your project's AI brain

Create a `CLAUDE.md` in your project root. This file tells Claude Code about your project:

```markdown
# My Project

## Stack
Next.js 16 + Convex + Clerk + Tailwind CSS v4

## Key Directories
- src/app/ -- Pages and layouts
- src/components/ -- React components
- convex/ -- Backend functions

## Commands
- npm run dev -- Start dev server
- npx convex dev -- Start backend
```

## Agent prompt

Copy this prompt to get started:

> Read the CLAUDE.md file and understand the project structure. You are an expert in the stack described. Follow the conventions in CLAUDE.md for all code changes.

## MCP Servers

Connect external tools to Claude Code via MCP:

```json
{
  "mcpServers": {
    "devdigest": {
      "command": "dd",
      "args": ["mcp"]
    }
  }
}
```

## Sub-agents

Claude Code can spawn sub-agents for parallel work:

```
Use the Task tool to spawn agents for:
- Research tasks
- Independent file edits
- Running tests in parallel
```

## Tips

- Keep CLAUDE.md under 200 lines -- concise beats comprehensive
- Use memory files in `.claude/` for session-specific context
- Run `claude --dangerously-skip-permissions` for fully autonomous mode (use with caution)
]]></content:encoded>
      <pubDate>Sun, 08 Mar 2026 00:00:00 GMT</pubDate>
      <category>ai-agents</category>
      <category>Guide</category>
      <author>Developers Digest</author>
      <enclosure url="https://www.developersdigest.tech/images/infographics/claude-code-setup-guide.webp" type="image/webp" />
    </item>
    <item>
      <title><![CDATA[MCP Servers Explained]]></title>
      <link>https://www.developersdigest.tech/guides/mcp-servers</link>
      <guid isPermaLink="true">https://www.developersdigest.tech/guides/mcp-servers</guid>
      <description><![CDATA[What MCP servers are, how they work, and how to build your own in 5 minutes.]]></description>
      <content:encoded><![CDATA[
# MCP Servers Explained

> **Prerequisites:** Node.js 18+, an AI coding tool that supports MCP (Claude Code, Cursor, or Windsurf), and basic TypeScript/JavaScript knowledge.

MCP (Model Context Protocol) lets AI tools connect to external services. Think of it as USB ports for AI -- plug in any tool and your AI agent can use it.

## How it works

1. An MCP server exposes **tools** (functions the AI can call)
2. Your AI client (Claude Code, Cursor, etc.) connects to the server
3. The AI can now call those tools as part of its workflow
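Under the hood, client and server exchange JSON-RPC 2.0 messages. A tool call on the wire looks roughly like this (the tool name and arguments are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "hello",
    "arguments": { "name": "world" }
  }
}
```

You rarely write these messages by hand - the MCP SDKs handle the transport - but knowing the shape helps when debugging a misbehaving server.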

## Example: DevDigest MCP Server

The `dd mcp` command starts an MCP server with these tools:

- `init_project` -- Scaffold a new project
- `list_commands` -- Show available commands

## Add to Claude Code

In your project's `.mcp.json`:

```json
{
  "mcpServers": {
    "devdigest": {
      "command": "dd",
      "args": ["mcp"]
    }
  }
}
```

## Build your own MCP server

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({
  name: "my-server",
  version: "1.0.0",
});

server.tool(
  "hello",
  "Say hello",
  { name: z.string() },
  async ({ name }) => ({
    content: [{ type: "text", text: `Hello, ${name}!` }],
  })
);

const transport = new StdioServerTransport();
await server.connect(transport);
```

## Agent prompt

Copy this to give your AI agent MCP context:

> This project uses MCP servers for external tool integration. Check .mcp.json for available servers. You can call MCP tools directly -- they appear as regular tools in your tool list.
]]></content:encoded>
      <pubDate>Sun, 08 Mar 2026 00:00:00 GMT</pubDate>
      <category>ai-agents</category>
      <category>Guide</category>
      <author>Developers Digest</author>
      <enclosure url="https://www.developersdigest.tech/images/infographics/mcp-servers-guide.webp" type="image/webp" />
    </item>
  </channel>
</rss>