AI Agents Deep Dive
TL;DR
A practical comparison of the five major AI agent frameworks in 2026 - architecture, code examples, and a decision matrix to help you pick the right one.
Six months ago, building an AI agent meant writing a ReAct loop from scratch. Now there are at least five production-grade frameworks competing for your codebase, each with a fundamentally different philosophy on how agents should work. Pick wrong and you will rewrite your orchestration layer in six months. Pick right and you ship weeks faster.
This guide puts LangGraph, CrewAI, AutoGen/AG2, Claude Agent SDK, and Vercel AI SDK through the same lens: architecture, code, pros, cons, and when to use each one. No marketing fluff. Just the trade-offs that matter.
Raw API calls work for simple single-tool agents. But the moment your agent needs any two of the following - multi-step orchestration, persistent state, error recovery, multi-agent coordination - a framework starts earning its keep.
Think of agent frameworks like web frameworks. You could build a web app with raw sockets and HTTP parsing, but Express or Next.js handles routing, middleware, and error handling so you focus on business logic. Agent frameworks do the same for LLM orchestration.

Latest version: 1.0.10 | GitHub: 24.6K stars | Downloads: 38M+ monthly
LangGraph models agents as directed graphs. Nodes are functions. Edges are transitions. State flows through the graph as a typed dictionary, and every node can read from and write to that state.
The core abstraction is a StateGraph. You define a state schema, add nodes as functions, connect them with edges (including conditional edges that branch based on state), and compile the graph into a runnable. Built-in checkpointing means every state transition persists automatically, so a crashed agent resumes exactly where it stopped. Version 1.0 added durable state that survives server restarts, cross-thread memory, and Command for dynamic edgeless flows.
```python
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver
from typing import TypedDict, Literal

class AgentState(TypedDict):
    query: str
    category: Literal["code", "docs", "general"] | None
    response: str | None

def classify(state: AgentState) -> AgentState:
    category = llm.invoke(f"Classify: {state['query']}")
    return {"category": category}

def handle_code(state: AgentState) -> AgentState:
    response = llm.invoke(f"Help with code: {state['query']}")
    return {"response": response}

def handle_docs(state: AgentState) -> AgentState:
    response = llm.invoke(f"Find docs for: {state['query']}")
    return {"response": response}

def route(state: AgentState) -> str:
    if state["category"] == "code":
        return "handle_code"
    elif state["category"] == "docs":
        return "handle_docs"
    return END

graph = StateGraph(AgentState)
graph.add_node("classify", classify)
graph.add_node("handle_code", handle_code)
graph.add_node("handle_docs", handle_docs)
graph.set_entry_point("classify")
graph.add_conditional_edges("classify", route)
graph.add_edge("handle_code", END)
graph.add_edge("handle_docs", END)
app = graph.compile(checkpointer=MemorySaver())
```
Every possible execution path is explicit in the graph definition. You can visualize, audit, and reason about agent behavior before running anything.
Human-in-the-loop is built in: pass interrupt_before to pause execution at any node for review or approval.

Best for: complex, stateful workflows with many conditional branches. Financial compliance agents. Multi-step data pipelines with approval gates. Anything where you need deterministic control flow with LLM decision points and an audit trail of every agent decision.
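The graph-executor idea itself is small enough to sketch in plain Python. This is not LangGraph's implementation - just an illustration, with made-up names, of nodes writing partial state updates, a router choosing the next node, and a checkpoint recorded after every step:

```python
# Illustrative sketch of a graph executor: nodes return partial state
# updates, a router picks the next node, and state is checkpointed after
# each step. Mimics the LangGraph model; NOT the LangGraph implementation.
END = "__end__"

def run_graph(nodes, router, entry, state, checkpoints):
    current = entry
    while current != END:
        state = {**state, **nodes[current](state)}  # merge partial updates
        checkpoints.append((current, dict(state)))  # persist after each node
        current = router(current, state)
    return state

# Hypothetical two-stage flow: classify, then route to a handler.
nodes = {
    "classify": lambda s: {"category": "code" if "bug" in s["query"] else "docs"},
    "handle_code": lambda s: {"response": f"Debugging help for: {s['query']}"},
    "handle_docs": lambda s: {"response": f"Docs for: {s['query']}"},
}

def router(node, state):
    if node == "classify":
        return "handle_code" if state["category"] == "code" else "handle_docs"
    return END  # handler nodes are terminal

checkpoints = []
final = run_graph(nodes, router, "classify", {"query": "bug in my loop"}, checkpoints)
print(final["response"])   # Debugging help for: bug in my loop
print(len(checkpoints))    # one snapshot per executed node
```

Because a snapshot is taken after every node, resuming a crashed run is just replaying from the last checkpoint - the property LangGraph's checkpointer provides for real.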
Latest version: 1.10.1 | GitHub: 44.6K stars | Downloads: 12M+ monthly
CrewAI uses a role-based metaphor. Instead of graphs, you define agents with roles, goals, and backstories, then organize them into crews that collaborate on tasks.
Three core concepts: Agents (with roles and tool access), Tasks (units of work assigned to agents), and Crews (the orchestration layer that manages execution). The framework supports sequential, hierarchical, and consensual process types. Native MCP support through crewai-tools[mcp] lets agents declare MCP servers inline. A2A protocol support enables cross-framework agent communication.
```python
from crewai import Agent, Task, Crew

researcher = Agent(
    role="Technical Researcher",
    goal="Find accurate, up-to-date information on developer tools",
    backstory="Senior developer advocate with deep knowledge of "
              "the JavaScript ecosystem and AI tooling.",
    llm="claude-sonnet-4",
)

writer = Agent(
    role="Technical Writer",
    goal="Turn research into clear, actionable content",
    backstory="Former engineering lead who writes concise "
              "documentation that developers actually read.",
    llm="claude-sonnet-4",
)

research_task = Task(
    description="Research {topic}. Focus on practical use cases "
                "and current limitations.",
    agent=researcher,
    expected_output="A structured research summary with key findings",
)

writing_task = Task(
    description="Write a developer guide based on the research. "
                "Include code examples.",
    agent=writer,
    expected_output="A complete guide in markdown format",
    context=[research_task],
)

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    verbose=True,
)

result = crew.kickoff(inputs={"topic": "MCP server development"})
```
The code reads like a job description. That is intentional. CrewAI optimizes for rapid prototyping and intuitive multi-agent coordination.
Team-based workflows where agents have distinct expertise. Content pipelines (researcher, writer, editor). Customer support triage with specialized handlers. Any workflow where the role metaphor naturally fits your domain.
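The sequential process behind a crew like the one above can be sketched in a few lines of plain Python. This is an illustration of the idea - tasks run in order, and outputs of the tasks named in context are fed into later prompts - not CrewAI's actual implementation; all names here are made up:

```python
# Sketch of a sequential crew: each task runs in order, and the outputs
# of its `context` tasks are prepended to its prompt. Illustrative only;
# NOT CrewAI's implementation.
def kickoff(tasks, inputs, call_agent):
    outputs = {}
    for task in tasks:
        prompt = task["description"].format(**inputs)
        # Prior task outputs listed in `context` become extra prompt context.
        for dep in task.get("context", []):
            prompt = f"Context from '{dep}':\n{outputs[dep]}\n\n{prompt}"
        outputs[task["name"]] = call_agent(task["agent"], prompt)
    return outputs

# Hypothetical stand-in for an LLM-backed agent call.
def fake_call(agent, prompt):
    return f"[{agent}] handled: {prompt.splitlines()[-1]}"

tasks = [
    {"name": "research", "agent": "Researcher",
     "description": "Research {topic}."},
    {"name": "write", "agent": "Writer",
     "description": "Write a guide about {topic}.", "context": ["research"]},
]
out = kickoff(tasks, {"topic": "MCP"}, fake_call)
print(out["write"])
```

The point of the sketch: in the sequential process, coordination is just prompt chaining with dependency-aware context injection, which is why CrewAI prototypes come together so quickly.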
Latest version: AG2 0.4+ | GitHub: 50.6K stars
AutoGen implements conversational agent teams where agents interact through multi-turn conversations. The v0.4 rewrite (AG2) added an event-driven core, async-first execution, and pluggable orchestration strategies.
The primary coordination pattern is GroupChat: multiple agents in a shared conversation where a selector determines who speaks next. Agents debate, critique, and refine each other's outputs through dialogue. AG2 introduced pluggable selectors (round-robin, LLM-based, custom) and an event-driven messaging system.
```python
from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager

coder = AssistantAgent(
    name="Coder",
    system_message="You write clean Python code. "
                   "Always include type hints and docstrings.",
    llm_config={"model": "gpt-4.1"},
)

reviewer = AssistantAgent(
    name="Reviewer",
    system_message="You review code for bugs, security issues, "
                   "and performance problems. Be specific.",
    llm_config={"model": "gpt-4.1"},
)

user_proxy = UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "output"},
)

group_chat = GroupChat(
    agents=[user_proxy, coder, reviewer],
    messages=[],
    max_round=6,
    speaker_selection_method="auto",
)

manager = GroupChatManager(
    groupchat=group_chat,
    llm_config={"model": "gpt-4.1"},
)

user_proxy.initiate_chat(
    manager,
    message="Write a Python function that validates email addresses "
            "using regex, then review it for edge cases.",
)
```
The conversational approach is natural for iterative tasks: code review (one agent writes, another reviews), content generation (writer + editor + fact-checker), and data analysis (analyst + validator).
Code generation and review workflows. Research tasks where thoroughness matters more than speed. Content generation pipelines with multiple revision rounds. Offline, quality-sensitive workflows where agents need to iterate and critique each other's outputs.
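The GroupChat pattern reduces to a loop with a pluggable speaker selector. The sketch below shows the round-robin variant in plain Python - an illustration of the coordination idea, not AutoGen's implementation; the agent and function names are invented:

```python
# Sketch of a GroupChat loop with a pluggable speaker selector, echoing
# AG2's round-robin option. Illustrative; NOT AutoGen's implementation.
def group_chat(agents, opening, select_speaker, max_round):
    messages = [("User", opening)]
    for turn in range(max_round):
        speaker = select_speaker(agents, messages, turn)
        reply = speaker["respond"](messages)   # agent sees the full transcript
        messages.append((speaker["name"], reply))
    return messages

def round_robin(agents, messages, turn):
    return agents[turn % len(agents)]          # alternate speakers in order

agents = [
    {"name": "Coder", "respond": lambda msgs: "def f(): ..."},
    {"name": "Reviewer", "respond": lambda msgs: "Add type hints."},
]
log = group_chat(agents, "Write and review f().", round_robin, max_round=4)
print([name for name, _ in log])
# ['User', 'Coder', 'Reviewer', 'Coder', 'Reviewer']
```

Swapping round_robin for an LLM-based selector - one that reads the transcript and picks the next speaker - is exactly the "auto" speaker_selection_method shown in the AutoGen example above.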
Latest version: 0.1.48 | Languages: Python, TypeScript
Anthropic's Claude Agent SDK (formerly Claude Code SDK) takes a tool-use-first approach where agents are Claude models equipped with tools, including the ability to invoke other agents as tools. It uses the same engine that powers Claude Code.
The defining feature is native MCP (Model Context Protocol) integration. Custom tools are implemented as in-process MCP servers that run directly within your application - no separate processes or network hops. Hooks provide lifecycle control: before_tool_call, after_tool_call, on_error, letting you inject logging, validation, or human approval at any point. Extended thinking gives you visible chain-of-thought reasoning in the API response.
```typescript
import { Agent, tool } from "claude-agent-sdk";

const searchTool = tool({
  name: "search_docs",
  description: "Search the documentation for relevant pages",
  parameters: {
    query: { type: "string", description: "Search query" },
  },
  execute: async ({ query }) => {
    const results = await searchIndex(query);
    return results.map((r) => r.title).join("\n");
  },
});

const agent = new Agent({
  model: "claude-sonnet-4-20250514",
  systemPrompt:
    "You are a developer support agent. Search docs, " +
    "then provide clear answers with code examples.",
  tools: [searchTool],
  hooks: {
    beforeToolCall: async (toolName, args) => {
      console.log(`Calling ${toolName}`, args);
    },
  },
});

const response = await agent.run(
  "How do I set up authentication with Clerk in Next.js?"
);
```
```python
from claude_agent_sdk import Agent, tool

@tool("search_docs", "Search documentation", {"query": str})
async def search_docs(args):
    results = await search_index(args["query"])
    return {"content": [{"type": "text",
                         "text": "\n".join(r.title for r in results)}]}

agent = Agent(
    model="claude-sonnet-4-20250514",
    system_prompt="You are a developer support agent. Search docs, "
                  "then provide clear answers with code examples.",
    tools=[search_docs],
)

response = await agent.run("How do I set up authentication with Clerk in Next.js?")
```
The architecture is deliberately simple: an agent loop, tools, and hooks. Anthropic relies on Claude's native capabilities for reasoning and coordination rather than adding framework abstractions.
Teams invested in the Anthropic ecosystem. Workflows requiring deep MCP integration with multiple tool servers. Agents that need lifecycle hooks for compliance and approval flows. Safety-critical applications in healthcare, finance, and legal. Projects already using Claude Code.
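The hook mechanism is worth seeing in isolation. The sketch below is a generic tool loop with before/after hooks - hypothetical names, not the Claude Agent SDK's API - showing how a before-hook can both audit and veto a call, which is the building block for the compliance and approval flows mentioned above:

```python
# Sketch of lifecycle hooks around a tool call: before-hooks can log or
# veto, after-hooks can transform the result. Hypothetical names; NOT
# the Claude Agent SDK's API.
def call_tool(tools, name, args, hooks):
    for before in hooks.get("before_tool_call", []):
        if before(name, args) is False:       # hooks can veto execution
            return {"error": f"{name} blocked by policy hook"}
    result = tools[name](**args)
    for after in hooks.get("after_tool_call", []):
        result = after(name, result)
    return result

audit_log = []
hooks = {
    "before_tool_call": [
        lambda name, args: audit_log.append((name, args)) or True,  # audit
        lambda name, args: name != "delete_db",   # block dangerous tools
    ],
    "after_tool_call": [lambda name, result: {"tool": name, "value": result}],
}
tools = {"search_docs": lambda query: f"3 pages matching '{query}'",
         "delete_db": lambda: "dropped"}

print(call_tool(tools, "search_docs", {"query": "auth"}, hooks))
print(call_tool(tools, "delete_db", {}, hooks))
```

Replacing the name check with a prompt to a human reviewer turns the same veto point into a human-approval gate.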
Latest version: 5.x | GitHub: 12K+ stars | npm: 2M+ weekly downloads
The Vercel AI SDK is the TypeScript-first option. It is not an agent framework in the traditional sense - it is a toolkit for building AI-powered applications that includes agent capabilities through its generateText function with maxSteps for multi-step tool use.
The SDK provides a unified interface across LLM providers (OpenAI, Anthropic, Google, Mistral, and more) with three core primitives: generateText for server-side generation, streamText for streaming responses, and useChat for React integration. Agent behavior comes from the maxSteps parameter, which creates a tool-use loop where the model can call tools and reason across multiple steps.
```typescript
import { generateText, tool } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const result = await generateText({
  model: anthropic("claude-sonnet-4-20250514"),
  maxSteps: 5,
  system:
    "You are a developer support agent. Use the available tools " +
    "to research questions, then provide clear answers.",
  tools: {
    searchDocs: tool({
      description: "Search the documentation",
      parameters: z.object({
        query: z.string().describe("Search query"),
      }),
      execute: async ({ query }) => {
        const results = await searchIndex(query);
        return results.map((r) => `${r.title}: ${r.summary}`).join("\n");
      },
    }),
    getCodeExample: tool({
      description: "Fetch a code example by topic",
      parameters: z.object({
        topic: z.string().describe("The topic to find examples for"),
      }),
      execute: async ({ topic }) => {
        return await fetchExample(topic);
      },
    }),
  },
  prompt: "How do I implement rate limiting in a Next.js API route?",
});

console.log(result.text);
console.log(`Steps taken: ${result.steps.length}`);
```
The SDK integrates seamlessly with React and Next.js through hooks like useChat and useCompletion, making it the natural choice for full-stack TypeScript applications.
React integration comes through useChat, useCompletion, and the streaming hooks for UI.

Best for: full-stack TypeScript applications with AI features. Next.js projects that need streaming chat, tool use, or multi-step reasoning. Teams that want model flexibility without framework lock-in. Situations where the agent is part of a larger web application rather than a standalone system.
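The maxSteps mechanism is just a bounded tool loop. The sketch below (in Python for brevity, with invented names - not the AI SDK's implementation) shows the control flow: each step the model either requests a tool call, whose result is fed back, or returns a final answer:

```python
# Sketch of a maxSteps-style tool loop: the model either requests a tool
# call or returns text; tool results are fed back until the model stops
# or the step budget runs out. Illustrative; NOT the AI SDK.
def run_agent(model, tools, prompt, max_steps):
    messages = [("user", prompt)]
    for _ in range(max_steps):
        action = model(messages)               # returns a tool call or text
        if action["type"] == "text":
            return action["text"], messages
        result = tools[action["tool"]](**action["args"])
        messages.append(("tool", result))      # feed result back to the model
    return "step budget exhausted", messages

# Hypothetical model: searches once, then answers from the tool result.
def fake_model(messages):
    if messages[-1][0] == "user":
        return {"type": "tool", "tool": "search_docs",
                "args": {"query": "rate limiting"}}
    return {"type": "text", "text": f"Answer based on: {messages[-1][1]}"}

tools = {"search_docs": lambda query: f"2 docs about {query}"}
answer, trace = run_agent(fake_model, tools, "How do I rate limit?", max_steps=5)
print(answer)   # Answer based on: 2 docs about rate limiting
```

The step budget is the safety valve: without it, a model that keeps requesting tools would loop forever.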
| Feature | LangGraph | CrewAI | AutoGen/AG2 | Claude Agent SDK | Vercel AI SDK |
|---|---|---|---|---|---|
| Orchestration | Directed graph | Role-based crews | Conversational GroupChat | Tool-use loop + hooks | maxSteps tool loop |
| Language | Python (TS beta) | Python | Python | Python + TypeScript | TypeScript |
| Model Lock-in | None | None | None | Claude only | None |
| State Persistence | Built-in checkpointing | Task outputs | Conversation history | Via MCP servers | Manual |
| Learning Curve | High | Low | Medium | Medium | Low |
| Multi-Agent | Native (sub-graphs) | Native (crews) | Native (GroupChat) | Sub-agents as tools | Manual |
| MCP Support | Via LangChain | Native | Community | Native (first-class) | Via tools |
| Human-in-the-Loop | Built-in (interrupt) | Manual | Built-in | Hooks | Manual |
| Streaming | Per-node | Limited | Limited | Native | Native |
| Best For | Complex stateful workflows | Fast prototyping | Iterative refinement | MCP-heavy, safety-critical | Full-stack TS apps |
| GitHub Stars | 24.6K | 44.6K | 50.6K | Growing | 12K+ |
| Production Maturity | High | Medium | Medium | Alpha | High |
Here is the decision tree, simplified:
You need complex branching workflows with audit trails - Use LangGraph. The graph model gives you deterministic control, and checkpointing is non-negotiable for regulated industries.
You want the fastest path from idea to working prototype - Use CrewAI. Define roles, assign tasks, run the crew. You will have agents working in an afternoon.
Your agents need to iterate, debate, and refine - Use AutoGen/AG2. The conversational pattern is natural for code review, research, and content pipelines where quality comes from multiple revision rounds.
You are building with Claude and need MCP integration - Use the Claude Agent SDK. Native MCP, lifecycle hooks, and extended thinking make it the tightest integration with Anthropic's ecosystem.
You are building a TypeScript web app with AI features - Use the Vercel AI SDK. It is not trying to be a full agent framework. It is the best toolkit for adding AI capabilities to Next.js applications.
You need model flexibility across providers - Use LangGraph, CrewAI, or Vercel AI SDK. All three are model-agnostic.
You are not sure yet - Start with CrewAI or Vercel AI SDK (depending on your language). Both have the lowest barrier to entry. You can always migrate to LangGraph when you hit the limits.
Yes, and many production systems do.
The key insight is that these frameworks operate at different levels of abstraction. The Vercel AI SDK is a toolkit. CrewAI is a coordination layer. LangGraph is an orchestration engine. They are not mutually exclusive.
CrewAI has the lowest learning curve. You define agents with roles and goals, assign tasks, and run them. A working multi-agent system takes under 20 lines of code. For TypeScript developers, the Vercel AI SDK is the most accessible starting point since it uses familiar patterns like Zod schemas and async functions.
LangGraph, CrewAI, AutoGen, and the Vercel AI SDK all support multiple providers. You can route different tasks to different models - use Claude for reasoning-heavy steps, GPT for code generation, and a local model for classification. The Claude Agent SDK is the only framework here locked to a single provider.
No. If your application is a single model with a few tools, raw API calls or the Vercel AI SDK's generateText with tools is sufficient. Frameworks add value when you need multi-step orchestration, persistent state, error recovery, or multi-agent coordination. Do not add framework complexity until the problem demands it.
MCP (Model Context Protocol) is a standard for how AI models discover and use tools. Instead of each framework implementing its own tool format, MCP provides a universal interface. This means a tool built as an MCP server works across Claude Code, Cursor, VS Code, and any MCP-compatible framework. CrewAI and the Claude Agent SDK have native MCP support. LangGraph and AutoGen can consume MCP servers through adapters.
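Concretely, MCP's wire format is JSON-RPC 2.0: clients discover tools with a tools/list request and invoke one with tools/call. The method and parameter shapes below follow the MCP specification; the tool name and arguments are made up for illustration:

```python
# Shape of an MCP tool invocation (JSON-RPC 2.0, per the MCP spec).
# The tool name and arguments here are hypothetical.
import json

call_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",          # MCP method for invoking a tool
    "params": {
        "name": "search_docs",       # hypothetical tool registered by a server
        "arguments": {"query": "rate limiting"},
    },
}
print(json.dumps(call_request, indent=2))
```

Because every framework speaks this same shape, a tool server written once works unchanged across all the MCP-capable frameworks in this guide.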
The Vercel AI SDK is TypeScript-native and the clear leader for TypeScript developers. The Claude Agent SDK has official TypeScript support. LangGraph has a beta TypeScript package. CrewAI and AutoGen are Python-only.
The cleanest migration path is to keep your tools framework-agnostic. Define tools as MCP servers or plain async functions, then swap the orchestration layer. If your tools are tightly coupled to a specific framework's abstractions, migration gets painful. Design for portability from the start.
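One hedged sketch of that portability pattern: define each tool as a plain function plus a JSON-Schema-style spec, then write thin per-framework adapters. The adapter shown targets OpenAI's function-calling format; the function and spec names are illustrative, not any framework's real API:

```python
# Framework-agnostic tool: a plain function plus a JSON-Schema-style spec.
# Swapping frameworks then only means writing a thin adapter.
def search_docs(query: str) -> str:
    """Plain function - no framework imports to couple to."""
    return f"results for {query}"

SEARCH_DOCS_SPEC = {
    "name": "search_docs",
    "description": "Search the documentation",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

def to_openai_tool(spec):
    """Adapter: wrap the spec in OpenAI's function-calling envelope."""
    return {"type": "function", "function": spec}

tool_def = to_openai_tool(SEARCH_DOCS_SPEC)
print(tool_def["function"]["name"])   # search_docs
```

An equivalent adapter for any other framework in this guide is a few lines; the tool body never changes.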
LangGraph and the Vercel AI SDK are the most production-mature, with companies running them at scale. The OpenAI Agents SDK and Claude Agent SDK are production-capable but newer. CrewAI and AutoGen are widely used but have fewer production case studies at enterprise scale. Always evaluate checkpointing, error recovery, and observability for your specific use case.