122 posts, 1 tool
SWE-Bench has an 81% false-positive problem. FrontierCode replaces it with mergeability as the metric - and the scores are sobering for every AI coding tool on the market.
Running multiple Claude Code agents on the same repo causes branch collisions and stash chaos - git worktrees fix this by giving each agent its own isolated directory while sharing one Git history.
GitHub Copilot switched to AI Credits billing on June 1 - here is what the change means for your team's budget, how Copilot Max fits in, and how costs compare to Claude Code and Codex.
Microsoft unveiled seven in-house MAI models at Build 2026, including MAI-Code-1-Flash now shipping in GitHub Copilot. Here is what the MoE architecture, training data, and Copilot rollout mean for your team's toolchain decisions in H2 2026.
Windsurf is now Devin Desktop, owned by Cognition after a turbulent 2025 acquisition saga. If the ownership shuffle has you reconsidering your tooling, here is a step-by-step guide to moving your workflow to Claude Code.
A Hacker News thread on config files that run code points at the next AI coding risk: agent hooks, skills, and editor rules need review like executable dependencies.
OpenAI's harness engineering post and new token-use research point to the same lesson: agentic coding teams need token budgets, receipts, and eval loops, not vibes.
The rsync Claude debate shows why teams need reproducible defect forensics before AI attribution becomes a public blame machine.
Microsoft's new in-house coding model matters less as a benchmark headline and more as a signal that Copilot is becoming a routing layer for cost, latency, ownership, and review quality.
GitHub Trending is full of agent memory and context tools. The useful version is not magic recall. It is a context ledger: source-linked, scoped, expiring memory that agents can inspect and users can audit.
A huge Hacker News thread says domain expertise is the real moat in agentic coding. The sharper version: tacit judgment only compounds when you turn it into examples, tests, DSLs, and review gates.
The DevDigest blog is no longer just a folder of markdown files. It is becoming a small content operating system: posts, tags, RSS, search, llms.txt, route discovery, content expansion reports, and app-linked build logs.
The DevDigest tools directory is not just a list of links. One registry now feeds tool pages, category filters, comparison routes, RSS, JSON APIs, search, sitemap discovery, and content expansion loops.
The AI coding market is noisy. The changes that matter are easier to spot when you separate model capability, editor loops, terminal agents, background agents, agent frameworks, UI layers, context, security, and cost.
If I were rebuilding my AI coding workflow on May 30, 2026, I would not pick one magic tool. I would pick a layered stack: terminal agent, editor, background agent, Mastra, CopilotKit, MCP, context, security, and cost controls.
AI coding agents become safer when permissions, logs, and rollback are designed as one system. Here is the operating loop I would put around any agent that can edit code, run tools, or open pull requests.
May 2026 was not about one more coding model leaderboard. The useful signal was control planes, UI-agent contracts, durable TypeScript workflows, usage economics, and runtime security.
GitHub Trending is full of anti-slop, taste, and compound-engineering skills. The real signal is not that agents need more prompts. It is that teams are trying to make subjective review criteria executable.
Claude Opus 4.8 looks like a benchmark bump, but the developer story is better honesty, dynamic workflows, and effort controls that make long-running agent work easier to review.
CodeGraph shows why coding agents need a local, queryable repo map. The win is not magic token savings. It is faster orientation, fewer wrong files, and better review receipts.