Building and using AI agents - multi-agent systems, autonomous coding, and orchestration.
64 resources - 59 posts, 1 tool, 4 guides

Persistent memory for coding agents is trending because every session still starts too cold. The hard part is not saving facts. It is proving recall, freshness, deletion, and rollback under real development pressure.

Claude Platform on AWS matters because it moves agent adoption into identity, billing, commitments, and platform controls. That is where enterprise AI work gets real.

Thinking Machines' interaction-models post points at a useful shift for developer tools: stop designing around single chat turns and start designing around shared work.

The TanStack npm incident was not just a package-security story. It was a reminder that AI agent workflows inherit every weak trust boundary in CI.

Claude Managed Agents now have multiagent sessions, outcomes, webhooks, and vault events. The practical takeaway is not just better agents. It is that agent runs need backend job discipline.

31 deployed apps. 7 down. Favicons missing on 20 of 24 reachable hosts. Sentry on zero. Here is how a single audit turned into 58 PRs in one afternoon - and what shipped, what didn't, and what the pattern was.

Notes from a single session running 200+ Claude Code subagents in parallel across 35 repos. What worked, what broke, and the patterns I codified into a skill so the recipe replays.

Codex automations are useful when recurring engineering work has clear inputs, reviewable outputs, and safe boundaries. Here is the practical playbook.

OpenAI is turning Codex from a coding assistant into a broader agent workspace for files, apps, browser QA, images, automations, and repeatable knowledge work.

Boris Cherny's loop-heavy Claude Code workflow points at the next Codex content lane: recurring agents that babysit PRs, CI, deploys, and feedback streams.

Andrej Karpathy's "loopy era" framing explains why Codex is becoming less like a chatbot and more like an agent loop manager for real software work.

Efficient agents do not stuff every tool result into the model context. They keep intermediate state in code, files, and execution environments, then return compact summaries and receipts.

Manual approval prompts stop protecting users when coding agents ask too often. The better pattern is risk-aware autonomy: safe defaults, narrow deny rules, and approvals only for meaningful changes.

Claude Code is turning into an orchestration layer for agent teams. Here is how subagents, MCP, hooks, and long context fit together in 2026.

A Show HN PDF form demo points at a bigger architecture shift: keep sensitive documents local, expose narrow browser tools to the model, and make AI assistance inspectable.

A deep comparison of Codex's new /goal loop and Claude Managed Agents outcomes, with practical workflow examples, control tradeoffs, and migration guidance for long-running tasks.

A long-form technical read on Flue from Fred K Schott, with deeper comparisons against OpenAI Agents, Vercel AI SDK, Google ADK, LangChain, Deep Agents, and CrewAI, plus practical production patterns.

A long-running coding agent is only useful if the environment around it can queue tasks, capture logs, checkpoint state, verify behavior, limit cost, and recover from failure.

Most agent tool APIs are just REST endpoints with nicer names. Production agents need intent-shaped tools that compress workflows, reduce context, and return reviewable receipts.

Skills turn a general coding agent into a trained teammate by packaging runbooks, scripts, examples, and domain-specific judgment into reusable instructions.

Warp going open source is not just a terminal story. It is a signal that AI coding tools are shifting from chat UX toward agent operations, where planning, execution, review, and feedback loops live close to the shell.

I told an agent to improve the site every 10 minutes and went to sleep. Here is what 12 new repos, 60 PRs, and three goofs taught me about overnight orchestration.

A practical architecture for multi-step Claude agents. Loop patterns, state management, error recovery, and the production gotchas that turn a five-step demo into a 20 percent success rate at scale.

Build MCP servers that connect Claude to your databases, APIs, and tools. Architecture, TypeScript SDK code, debugging, and the production gaps the spec doesn't cover.

Master tool use in the Claude API. Schema design, retry logic, multi-step loops, and the failure modes that only show up at 10k calls a day.

Five worked examples showing how the new Developers Digest products plug into each other. Real agent filesystems, auto-snapshots, gated skill libraries, eval suites, and a recursive MCP host.

agentfs is filesystem-shaped storage for AI agents. Postgres-backed on Neon, no cold starts, no exec by design. Pay-only plans start at twenty dollars.

Ten private tools shipped overnight - observability, skills, hooks, prompts, and evals - aimed at the agent infrastructure gap small teams keep falling into.

The math of agent pipelines is brutal. 85% reliability per step compounds to about 20% at 10 steps. Here is why long chains collapse in production, and the six patterns the field has converged on to fight the decay.
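
The compounding figure above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming every step succeeds independently with the same probability; the function name `chain_success` is illustrative and does not come from the linked post.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumption: steps fail independently and share one success rate.
    """
    return per_step ** steps


if __name__ == "__main__":
    # 85% per-step reliability over a 10-step chain
    print(f"{chain_success(0.85, 10):.1%}")  # 19.7%
```

Real pipelines rarely have fully independent steps, so treat the result as a rough estimate of the decay, not a measurement.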

From single-agent baselines to multi-level hierarchies, these are the seven patterns for wiring AI agents together in production. Each with a decision rule, an implementation sketch, and the tradeoffs that actually matter.

Five managed-agent providers, five pricing models, zero unified cost attribution. If you're running agents overnight, you need FinOps you don't have yet.

Four agents, same tasks. Honest trade-offs from a developer shipping production apps with all of them.

CLAUDE.md is the highest-leverage file in any Claude Code project. Here's what goes in one, what doesn't, and the patterns that actually ship.

Autocomplete wrote the line. Agents write the pull request. The shift from Copilot to Claude Code, Cursor Agent, and Devin - explained with links to the docs that prove every claim.

MCP is the USB-C of AI agents. What the Model Context Protocol is, why Anthropic built it, and how to install your first server in Claude Code or Cursor. Fact-checked against the official MCP spec.

Claude Code is Anthropic's AI coding agent for your terminal. What it does, how it works, how it compares to Cursor and Codex, and how to ship your first feature with it. Fact-checked against official docs.

A practical security playbook for running Codex cloud tasks safely in 2026 using OpenAI docs: internet access controls, domain allowlists, HTTP method limits, and review workflows.

Hacker News keeps arguing about Claude Code, Codex, skills, MCP, and orchestration. Under the noise, the same four truths keep surfacing: workflows matter more than demos, verification is the bottleneck, skills beat prompts, and orchestration matters more than raw autonomy.

How to use AI agents to plan, scaffold, build, test, and deploy a SaaS product. Parallel development patterns, real workflow examples, and the operational details that determine whether your AI-assisted build succeeds or fails.

Context engineering is the practice of designing the persistent information that surrounds every AI interaction: CLAUDE.md files, system prompts, skill libraries, and memory systems. It is the single highest-leverage skill for developers working with AI agents in 2026.

Production-tested patterns for orchestrating AI agent teams - from fan-out parallelism to hierarchical delegation. Covers CrewAI, LangGraph, AutoGen, OpenAI Agents SDK, Google ADK, and custom approaches with real code.

AI agents that reflect on failures, accumulate skills, and get better with every session. Reflection patterns, memory architectures, skill extraction, and working code examples for building agents that actually learn.

Agents forget everything between sessions. Here are the patterns that fix that: CLAUDE.md persistence, RAG retrieval, context compression, and conversation summarization.

AI agents fail in ways traditional debugging cannot catch. Here are the tools and patterns for finding and fixing broken agent loops, tool failures, and context issues.

A practical comparison of the five major AI agent frameworks in 2026 - architecture, code examples, and a decision matrix to help you pick the right one.

AI agent skills are not just for developers. Here is how 12 professions use packaged AI workflows to do better knowledge work.

A step-by-step guide to building AI agents that actually work. Choose a framework, define tools, wire up the loop, and ship something real.

How to spec agent tasks that run overnight so you wake up to verified, reviewable code. The spec format, pipeline, and review workflow.

AI agents use LLMs to complete multi-step tasks autonomously. Here is how they work and how to build them in TypeScript.

A practical guide to building AI agents with TypeScript using the Vercel AI SDK. Tool use, multi-step reasoning, and real patterns you can ship today.

From swarms to pipelines - here are the patterns for coordinating multiple AI agents in TypeScript applications.

AI coding agents are submitting pull requests to open source repos - and some CONTRIBUTING.md files now contain prompt injections targeting them.

MCP lets AI agents connect to databases, APIs, and tools. Here is what it is and how to use it in your TypeScript projects.

OpenClaw has 247K stars and zero MCPs. The best tools for AI agents aren't new protocols - they're the CLIs developers have used for decades.

OpenAI released their Agents SDK for TypeScript with first-class support for tool calling, structured outputs, multi-agent coordination, streaming, and human-in-the-loop approvals. Here is how each piece works.

OpenAI's Deep Research is an AI agent inside ChatGPT that plans and executes multi-step research workflows, browsing dozens of websites and producing cited reports in minutes instead of hours.

OpenAI added scheduled tasks and reminders to ChatGPT, turning it from a chat interface into something closer to a personal AI agent. Here is how it works, what it can do today, and where this is heading.

Google's Gemini Advanced includes a deep research feature that searches dozens of websites, verifies information across multiple sources, and generates detailed cited reports. Here is how it works and how it compares to other AI research tools.

Wire a Python LangGraph agent into a Next.js frontend using CopilotKit's co-agent architecture. Full walkthrough covering the graph, search nodes, streaming state, and the React UI.

Guide: Configure Claude Code for maximum productivity - CLAUDE.md, sub-agents, MCP servers, and autonomous workflows.

Guide: What MCP servers are, how they work, and how to build your own in 5 minutes.

Guide: Step-by-step guide to building an MCP server in TypeScript - from project setup to tool definitions, resource handling, testing, and deployment.

Guide: Deep comparison of the top AI agent frameworks - architecture, code examples, strengths, weaknesses, and when to use each one.