204 items
198 posts, 2 tools, 4 guides
The TanStack npm incident was not just a package-security story. It was a reminder that AI agent workflows inherit every weak trust boundary in CI.
Ruflo turns Claude Code and Codex into a larger agent harness with plugins, memory, swarms, MCP tools, and federation. The useful question is not the star count. It is how much harness you actually need.
Claude Managed Agents now have multiagent sessions, outcomes, webhooks, and vault events. The practical takeaway is not just better agents. It is that agent runs need backend job discipline.
DeepSeek-TUI is a Rust-built terminal coding agent wrapping the DeepSeek V4 API with full tool use, MCP server support, a composable skills system, and three operational modes for different risk tolerances.
31 deployed apps. 7 down. Favicons missing on 20 of 24 reachable hosts. Sentry on zero. Here is how a single audit turned into 58 PRs in one afternoon - and what shipped, what didn't, and what the pattern was.
Notes from a single session running 200+ Claude Code subagents in parallel across 35 repos. What worked, what broke, and the patterns I codified into a skill so the recipe replays.
Codex automations are useful when recurring engineering work has clear inputs, reviewable outputs, and safe boundaries. Here is the practical playbook.
OpenAI is turning Codex from a coding assistant into a broader agent workspace for files, apps, browser QA, images, automations, and repeatable knowledge work.
Boris Cherny's loop-heavy Claude Code workflow points at the next Codex content lane: recurring agents that babysit PRs, CI, deploys, and feedback streams.
Andrej Karpathy's loopy era frame explains why Codex is becoming less like a chatbot and more like an agent loop manager for real software work.
Efficient agents do not stuff every tool result into the model context. They keep intermediate state in code, files, and execution environments, then return compact summaries and receipts.
Manual approval prompts stop protecting users when coding agents ask too often. The better pattern is risk-aware autonomy: safe defaults, narrow deny rules, and approvals only for meaningful changes.
Claude Code is turning into an orchestration layer for agent teams. Here is how subagents, MCP, hooks, and long context fit together in 2026.
A Show HN PDF form demo points at a bigger architecture shift: keep sensitive documents local, expose narrow browser tools to the model, and make AI assistance inspectable.
A deep comparison of Codex's new /goal loop and Claude managed agents outcomes, with practical workflow examples, control tradeoffs, and migration guidance for long-running tasks.
A long-form technical read on Flue from Fred K Schott, with deeper comparisons against OpenAI Agents, Vercel AI SDK, Google ADK, LangChain, Deep Agents, and CrewAI, plus practical production patterns.
A long-running coding agent is only useful if the environment around it can queue tasks, capture logs, checkpoint state, verify behavior, limit cost, and recover from failure.
Most agent tool APIs are just REST endpoints with nicer names. Production agents need intent-shaped tools that compress workflows, reduce context, and return reviewable receipts.
Skills turn a general coding agent into a trained teammate by packaging runbooks, scripts, examples, and domain-specific judgment into reusable instructions.
Warp going open source is not just a terminal story. It is a signal that AI coding tools are shifting from chat UX toward agent operations, where planning, execution, review, and feedback loops live close to the shell.

New tutorials, open-source projects, and deep dives on coding agents - delivered weekly.