AI AGENTS

65 items

61 posts, 4 guides

Blog - May 20, 2026

Forge Shows the Local Agent Reliability Gap Is a Harness Problem

Forge hit the Hacker News front page with a strong claim: small local models can become much more useful at tool-calling when the harness catches structural failures, retries intelligently, and controls context.

AI Agents Local Models Developer Workflow Open Source

Blog - May 19, 2026

Anthropic Buying Stainless Is About Agent Plumbing

Anthropic's Stainless acquisition is not just an SDK deal. It is a bet that agents need generated SDKs, CLIs, docs, and MCP servers from the same source of truth.

Anthropic AI Agents SDKs MCP Developer Tools

Blog - May 13, 2026

Agent Memory Benchmarks Are Not Enough

Persistent memory for coding agents is trending because every session still starts too cold. The hard part is not saving facts. It is proving recall, freshness, deletion, and rollback under real development pressure.

AI Agents Context Engineering Claude Code Codex

Blog - May 12, 2026

Claude Platform on AWS Is Enterprise Agent Plumbing, Not Just Procurement

Claude Platform on AWS matters because it moves agent adoption into identity, billing, commitments, and platform controls. That is where enterprise AI work gets real.

Claude AWS AI Agents Enterprise AI Developer Workflow

Blog - May 12, 2026

Interaction Models Are the Next AI Developer Tool Interface

Thinking Machines' interaction-models post points at a useful shift for developer tools: stop designing around single chat turns and start designing around shared work.

AI Interfaces Developer Tools AI Agents UX Multimodal AI

Blog - May 12, 2026

TanStack's npm Compromise Is the CI Lesson Agent Teams Needed

The TanStack npm incident was not just a package-security story. It was a reminder that AI agent workflows inherit every weak trust boundary in CI.

Security AI Agents GitHub Actions Developer Workflow Supply Chain

Blog - May 9, 2026

Claude Managed Agents Are Starting to Look Like Backend Jobs

Claude Managed Agents now have multiagent sessions, outcomes, webhooks, and vault events. The practical takeaway is not just better agents. It is that agent runs need backend job discipline.

Claude AI Agents Developer Tools Backend Orchestration

Blog - May 6, 2026

How We Patched 100+ PRs Across Our App Empire in One Day

31 deployed apps. 7 down. Favicons missing on 20 of 24 reachable hosts. Sentry on zero. Here is how a single audit turned into 58 PRs in one afternoon - and what shipped, what didn't, and what the pattern was.

AI Agents Claude Code Orchestration DevOps Postmortem

Blog - May 6, 2026

219 PRs in One Day: A Parallel Agent Fan-Out Postmortem

Notes from a single session running 200+ Claude Code subagents in parallel across 35 repos. What worked, what broke, and the patterns I codified into a skill so the recipe replays.

AI Agents Claude Code Orchestration Agentic Coding Parallelism

Blog - May 5, 2026

Codex Automations: Where Scheduled AI Agents Actually Help

Codex automations are useful when recurring engineering work has clear inputs, reviewable outputs, and safe boundaries. Here is the practical playbook.

Codex OpenAI AI Agents Automation Developer Tools

Blog - May 5, 2026

Codex Is Becoming a General-Purpose AI Agent, Not Just a Coding Tool

OpenAI is turning Codex from a coding assistant into a broader agent workspace for files, apps, browser QA, images, automations, and repeatable knowledge work.

Codex OpenAI AI Agents Developer Tools Automation

Blog - May 5, 2026

Codex Loops: What Boris Cherny Gets Right About Managing Agent Work

Boris Cherny's loop-heavy Claude Code workflow points at the next Codex content lane: recurring agents that babysit PRs, CI, deploys, and feedback streams.

Codex AI Agents Claude Code Developer Workflow Automation

Blog - May 5, 2026

Karpathy's Loopy Era Is the Best Way to Understand Codex

Andrej Karpathy's "loopy era" framing explains why Codex is becoming less like a chatbot and more like an agent loop manager for real software work.

Codex AI Agents Agentic Engineering OpenAI Developer Workflow

Blog - May 2, 2026

The 98% Context Reduction Pattern

Efficient agents do not stuff every tool result into the model context. They keep intermediate state in code, files, and execution environments, then return compact summaries and receipts.

Context Engineering MCP AI Agents Claude Code

Blog - May 2, 2026

Approval Fatigue Is an Agent Security Bug

Manual approval prompts stop protecting users when coding agents ask too often. The better pattern is risk-aware autonomy: safe defaults, narrow deny rules, and approvals only for meaningful changes.

AI Agents Security Claude Code Developer Workflow

Blog - May 2, 2026

Claude Code Agent Teams, Subagents, and MCP: The 2026 Playbook

Claude Code is turning into an orchestration layer for agent teams. Here is how subagents, MCP, hooks, and long context fit together in 2026.

Claude Code Anthropic Subagents MCP AI Agents

Blog - May 2, 2026

Client-Side Tool Calling Is the Privacy Pattern AI Apps Need

A Show HN PDF form demo points at a bigger architecture shift: keep sensitive documents local, expose narrow browser tools to the model, and make AI assistance inspectable.

AI Agents Privacy Tool Calling Local AI Developer Architecture

Blog - May 2, 2026

Codex /goal and Claude Managed Outcomes: The New Control Loops

A deep comparison of Codex's new /goal loop and Claude Managed Agents outcomes, with practical workflow examples, control tradeoffs, and migration guidance for long-running tasks.

AI Agents OpenAI Claude Orchestration Managed Agents Developer Tools

Blog - May 2, 2026

Flue: The Agent Harness Framework and Why It Feels Different

A long-form technical read on Flue from Fred K Schott, with deeper comparisons against OpenAI Agents, Vercel AI SDK, Google ADK, LangChain, Deep Agents, and CrewAI, plus practical production patterns.

AI Agents TypeScript Developer Tooling Agent Frameworks Infrastructure

Blog - May 2, 2026

Long-Running Agents Need Harnesses, Not Hope

A long-running coding agent is only useful if the environment around it can queue tasks, capture logs, checkpoint state, verify behavior, limit cost, and recover from failure.

AI Agents Reliability Claude Code Developer Workflow
