Topic
Essential developer tools - CLIs, editors, frameworks, and infrastructure.
146 resources - 139 posts, 7 tools

Describe an app in plain language and get a working single-file build back with a live sandboxed preview. Revise it by talking to it, share it with a link, or download the file. Here is what single-file buys you, how revisions work, the honest limits, and what it costs.

A decision framework for 2026: MCP servers give an agent access to a live system, Agent Skills teach it how to do a task. Here is when to build each, when to build both, and the criteria that actually decide it, grounded in the MCP spec and Anthropic's skills docs.

A companion guide to the Nimbalyst video: an open-source visual workspace that runs Codex and Claude Code from your existing subscriptions, with a Kanban board, a planning workflow, and AI commits. Here is what it does and where it fits.

A companion guide to the Agents 101 video: a behind-the-scenes walkthrough of building and deploying AI agents fast on Vercel, the agentic infrastructure stack. Here is the map of what to learn and where to go next.

A fair, sourced comparison of the TTS APIs developers reach for in 2026: OpenAI, ElevenLabs, xAI Grok, and Cartesia. Quality vs latency vs price, streaming, voice cloning policies, and whether to route through an AI gateway or go direct.

A companion guide to the Codex Record & Replay video: OpenAI Codex can now record a recurring computer task and replay it as a reusable automation skill. Here is what the feature is and where it fits.

A companion guide to the GLM 5.2 video: an open-weight model positioned against GPT-5.5, walked through with benchmarks, pricing, and a live OpenCode demo. Here is what the video covers and where to go deeper.

The Godot Foundation has established a policy banning autonomous AI agent code and substantial AI-generated contributions, citing reviewer burnout and concerns about maintainer mentorship.

A companion guide to the GPT-5.5 video: OpenAI's newly released model rolling out to ChatGPT and Codex, reviewed through benchmarks, agent capabilities, context window, and pricing. Here is what the video covers and where to go deeper.

A companion guide to the Loop Engineering video: the shift from repeatedly prompting an LLM to building long-running loops, goals, and automations. Here is the core idea and where to go deeper.

A companion guide to the OpenAI Codex video: a tour of the Codex desktop app, its plan and goal modes, plugins, multi-agent workflows, and UI annotation. Here is what the video shows and where to go deeper.

Ngrok engineer Sam Rose ported 100,000 lines of Kubernetes to TypeScript, creating a browser-based cluster for educational use - with 2,059 tests proving it behaves like real k8s.

A new project proposes a graphical shell layer for SSH that turns remote servers into browsable desktops. The HN discussion digs into architecture choices, the terminology debate, and whether this solves a real problem.

LangChain's June LangSmith updates point to a practical agent-ops pattern: Fleet templates, on-call triage, computer use, Slack interrupts, MCP auth, traces, and eval progress all belong in one operator loop.

OpenAI's June 2026 API changelog looks like scattered platform plumbing. Read together, moderation scores, workload identity, Admin APIs, prompt-cache retention, container billing, and Secure MCP Tunnel are the pieces teams need to run agents with real controls.

Grok Build is xAI's agentic CLI with 8 parallel subagents, a plan-first workflow, and Arena Mode for competing outputs. Installation, pricing, real commands, and how it compares to Claude Code and Codex.

Bumblebee is Perplexity's open source scanner for detecting compromised packages, extensions, and MCP configs on developer machines. A read-only Go binary that checks npm, PyPI, Go modules, and 10+ ecosystems against exposure catalogs - without running any install scripts. Here is how to set it up and use it.

AI-assisted development generates PRs faster than humans can review them. Here are the tools that help - CodeRabbit, DeepSource, Greptile, and others compared on pricing, platform support, and security capabilities.

A viral Hacker News thread about AI affordability points at the right problem, but developer teams need a more useful cost model: retries, cache misses, review time, routing, and failed loops.

Armin Ronacher's new essay explores the tension between letting AI agents loop autonomously and maintaining the engineering comprehension that makes software maintainable. The Hacker News discussion adds practical caveats worth reading.

Claude Tag is Anthropic's new Slack-based beta for Team and Enterprise users. The important shift is not chat convenience - it is shared agent identity, channel context, and team-visible work.

Envoy AI Gateway 1.0 is production-ready. The useful question for builders is when an Envoy-based LLM gateway beats direct SDK calls, LiteLLM, OpenRouter, or a hosted AI gateway.

F3 is trending on Hacker News as a research prototype for a future-proof columnar file format. The useful takeaway is not to replace Parquet tomorrow. It is that data files are starting to carry more of their own runtime contract.

GitHub's June Copilot updates point beyond autocomplete: CLI access, bring-your-own-key model routing, AI credit metrics, and external agent providers make Copilot a governed agent platform.

LangChain's rubrics for Deep Agents point at a practical agent pattern: self-correction works only when rubrics are versioned, executable, and sampled against human review.

A new layer is forming around Claude Code, Codex, Copilot CLI, and local memory tools: the local coding agent workspace. It is not the model. It is the bench where agents get supervised.

Oak is an early bet that AI coding agents need version control shaped around sessions, virtual workspaces, and token budgets. The idea is risky, but the pressure on Git workflows is real.

New role-confusion research explains why prompt injection keeps surviving better prompts. Models do not reliably perceive which text is instruction, tool output, user content, or their own reasoning.

A developer used OpenAI Codex to build a fully open-source WYSIWYG editor for TikZ figures. The technical approach and reception on Hacker News offer a useful case study in what agent-built software looks like when shipped.

A trending Codex SQLite WAL bug is a useful warning for every local coding agent: logs, disks, background processes, and telemetry paths need budgets too.

A Codex CLI SQLite logging bug showed how global TRACE logs can burn SSD write endurance. OpenAI has now merged fixes, but the incident is a useful local-agent operations lesson.

Oak rethinks version control for agentic workflows with virtual mounts, faster snapshots, and lower VCS-related token overhead. Here's what the HN community thinks about this Show HN.

As coding agents get easier to delegate to, the scarce resource shifts from code generation to review capacity, CI minutes, environment reliability, and merge discipline.

Codex can point at OpenAI-compatible model providers, local Ollama servers, and internal model proxies. Here is the practical config pattern, the sharp edges, and when to use it.

Hex's data-agent lab shows the practical eval pattern AI teams should copy: compare candidates against stable baselines, keep receipts, and judge changes by task behavior.

Cloudflare shipped wrangler deploy --temporary on June 19, 2026. AI agents can now deploy Workers, D1 databases, and KV stores without browser auth flows. Here is how it works.

The new wrangler deploy --temporary flag creates ephemeral Cloudflare accounts for AI agents. 60-minute deployments, no OAuth, no browser - just deploy and claim later.

Most developers only know .gitignore, but Git offers two other ignore mechanisms for local workflows and machine-wide patterns. Here's when to use each.

GitHub's Agent Finder discovers and invokes Claude, Codex, MCP servers, and skills automatically. Here is how the new ARD specification changes AI coding tool integration.

Auto-installing tree-sitter grammars, built-in markdown mode, window layout commands, and more - the upcoming Emacs release absorbs features that used to require external packages.

Stop the approval-fatigue prompts without going full YOLO mode. A hands-on guide to Claude Code's permission system - settings.json scopes, allow/deny/ask rules, tool specifiers, and the headless flags that actually matter.

At its Compile conference, Cursor announced Origin: a Git-compatible code hosting platform designed around AI agents as first-class users. Built on its Graphite acquisition, it promises agent-driven merge conflict resolution, stacked PRs, and MCP-extensible automation. Here is what was actually announced, what is still a waitlist promise, and why it matters for developers.

Epic Games open-sourced Lore, a centralized version control system designed for binary-heavy game projects. It uses Merkle trees, on-demand file hydration, and native chunked storage to handle terabyte-scale repos that Git struggles with.

On June 2, 2026, GitHub made the Copilot SDK generally available. It exposes the same agent runtime behind Copilot - planning, tool calls, file edits, streaming, MCP - across TypeScript, Python, Go, .NET, Rust, and Java. Here is what changed at GA and what it means for builders.

On June 16, 2026, Microsoft's Work IQ APIs reach general availability - a workplace intelligence layer that hands agents pre-assembled, permission-trimmed Microsoft 365 context instead of raw Graph calls. Here is what the four domains, three protocols, and consumption pricing mean for developers building enterprise agents.

Databricks open-sourced Omnigent, a meta-harness that sits above individual agent CLIs so your sessions, policies, and skills are not locked inside any single tool. Here is what it does, how to install it, and where it fits if you already run Claude Code and Codex.

The IETF published RFC 10008 defining a new HTTP QUERY method - GET with a request body. It is safe, idempotent, cacheable, and solves the longstanding problem of complex queries hitting URL length limits.

Cursor Automations lets AI agents run in the background based on triggers, not prompts. Here is how to set them up, configure triggers, and integrate into your workflow.

OpenRouter Fusion turns multi-model panels into an API feature. The useful lesson is not to run every prompt through more models. It is to define when a task deserves an expensive second opinion.

GitHub's latest agent workspace trend points at a boring but important primitive: agents need explicit filesystem contracts before they get more tools.

Kiro is AWS's new agentic IDE built on spec-driven development. Amazon Q Developer support ends April 2027. Here is what Kiro does differently and how to migrate.

Claude agents vs skills, untangled: agents are workers with their own context window, skills are instructions loaded on demand. Here is the decision table.

Auto mode replaces permission prompts with a background safety classifier - here is how the Shift+Tab cycle, hard_deny rules, and glob deny patterns actually fit together.

Claude Code dynamic workflows turn orchestration into a JavaScript script that runs up to 1,000 agents per run - here is how scripts, schemas, budgets, and resume actually work.

Anthropic's docs say the tokenizer introduced with Opus 4.7 can use up to 35% more tokens for the same text. Here is what that does to per-request cost, max_tokens, and cross-model comparisons.

Fable 5 long-running requests can run for many minutes per turn and hours per autonomous run. Here is how to configure client timeouts, streaming keepalive, batch polling, and background patterns so they actually finish.

Anthropic says persistent file-based memory improved Fable 5 three times more than it improved Opus 4.8. Here is the full memory tool setup - handlers, security, and context editing included.

Task budgets give Claude a token countdown for the whole agentic loop, so the model paces itself instead of discovering the limit when max_tokens truncates it. Here is how the beta works on Fable 5, what it does not enforce, and where it fits next to effort and the Usage API.

A verified directory of the frontier AI models in June 2026 - Claude Fable 5, GPT-5.5, GPT-5.4, Gemini 3.1 Pro, and DeepSeek V4 - with pricing checked against official docs.

How to use Claude Fable 5 across every access path: claude.ai plans through June 22, the Claude API, Amazon Bedrock, Vertex AI, and Microsoft Foundry, with setup effort and first-prompt tips.

An ops guide to managing a fleet of Claude agents: spawning patterns, worktree isolation, build gates, orphaned-agent failure modes, and OpenTelemetry monitoring.

Migrating off retired GPT models in 2026: the live retirement table, what maps to what, an eval-before-switch day plan, and when to jump providers.

Ultracode is two documented things: a prompt keyword that turns one task into a dynamic workflow, and an /effort setting that pairs xhigh reasoning with automatic orchestration. Here is exactly what the docs say.

Twelve documented Claude Fable 5 use patterns - agent orchestration, overnight runs, 1M-context refactors, effort tuning - each with a how-to seed and doc link.

Within hours of Dario Amodei publishing 'Policy on the AI Exponential,' critics surfaced across Hacker News and the tech press. We surveyed the actual reactions, characterized each fairly, and weighed which critiques matter most if they turn out to be right.

Apple shipped a LanguageModel protocol at WWDC 2026 that lets iOS and macOS developers swap between Claude, Gemini, and local models with a single dependency change. Here is what OS-level provider abstraction actually means for switching costs, moats, and your architecture decisions.

A practical comparison of the two most capable terminal-native AI coding agents in 2026 - covering pricing, model flexibility, multi-agent workflows, and which one fits your team.

Claude Desktop spawns a Hyper-V virtual machine consuming roughly 1.8 GB of RAM on every Windows launch - even when you only open it for chat. Here is what the VM is for, who gets hit hardest, and the workarounds that actually work.

Claude Managed Agents is in public beta with solid sandboxing and session persistence - but the headline orchestration features are still locked behind a research preview waitlist. Here's what teams can actually ship today, what it costs, and when DIY alternatives make more sense.

Fable 5 drains the 5-hour rolling window dramatically faster than Opus or Sonnet. Here is what the plan multipliers actually mean in practice, what changes on June 22, and how to make your allocation last.

The Codex changelog from April through June 2026 covers GPT-5.5, Goal mode going stable, Sites, a Chrome extension, Amazon Bedrock support, and mobile access from iOS. Here is what actually shipped and what it means in practice.

Anthropic shipped Fable 5 and a June 22 subscription cliff. OpenAI shipped GPT-5.5 inside Codex plus automations, browser use, and computer control. Here is the honest June 2026 update on which tool fits which developer.

Cursor's $50B valuation puts a developer tool above roughly 400 Fortune 500 companies. Here's a clear-eyed look at whether that valuation reflects reality - and which AI IDE actually fits your workflow in 2026.

Cursor and Devin Desktop have converged on similar pricing but diverged hard on philosophy. Here is what actually matters when picking one for your team in 2026.

On the same day Dario Amodei called for FAA-style mandatory testing of frontier AI, Anthropic shipped Fable 5 - the public face of Mythos - with classifier guardrails and a June 22 pricing window. Responsible disclosure or a live contradiction?

Claude Fable 5 routes blocked queries to Opus 4.8 rather than refusing outright - but the fallback is not automatic for API users and requires explicit configuration. Here is the complete developer guide to the refusal architecture.

Anthropic's Claude Fable 5 includes undisclosed interventions that silently degrade responses for certain ML development tasks - no fallback notice, no refusal, just worse answers.

Fable 5 posts an 80.3% SWE-Bench Pro score and costs 2x Opus 4.8 - here is the task-profile scoring guide that tells you when the premium pays off.

Factory AI's Droid agent surfaces a new competitive front in coding tools: cost-per-completed-task. Here's what their architecture reveals about where the whole industry is heading.

Factory Droid is a terminal-native AI coding agent with multi-model routing, headless CI execution, and browser automation built in. Here is everything you need to know to set it up and decide if it fits your workflow.

Moonshot AI's Kimi CLI offers unlimited coding sessions at zero marginal cost. Claude Code offers polish, deep Anthropic integration, and a subscription most serious devs already hold. Here is how to decide.

A hands-on look at Mastra, the open source TypeScript framework for building production-ready AI agents and workflows - with verified setup commands, honest tradeoffs, and current pricing.

Windsurf is now Devin Desktop, owned by Cognition after a turbulent 2025 acquisition saga. If the ownership shuffle has you reconsidering your tooling, here is a step-by-step guide to moving your workflow to Claude Code.

A first-hand visit to DeepSeek HQ reveals something more interesting than benchmark scores: a 300-person company that treats AI as infrastructure, not eschatology - and what that means for API pricing everywhere.

A practical comparison of OpenAI's Agents SDK and Anthropic's Claude Agent SDK - orchestration models, tool ecosystems, sandboxing, and how to choose the right platform for your team.

OpenRouter gives you one API key for 300+ models, automatic fallbacks, and intelligent provider routing. Here is what it actually costs, how to set it up in five minutes, and when you should skip it entirely.

A practical comparison of LLM routing tools - LiteLLM, Portkey, and OpenRouter - covering cost management, fallbacks, caching, and when to use each for production AI applications.

The DevDigest blog is no longer just a folder of markdown files. It is becoming a small content operating system: posts, tags, RSS, search, llms.txt, route discovery, content expansion reports, and app-linked build logs.

A field note on adding pricing, Pro, apps, sponsors, partners, hiring, consulting, newsletter, and weekly rollup paths to DevDigest without turning the site into vague growth copy.

The DevDigest tools directory is not just a list of links. One registry now feeds tool pages, category filters, comparison routes, RSS, JSON APIs, search, sitemap discovery, and content expansion loops.

The AI coding market is noisy. The changes that matter are easier to spot when you separate model capability, editor loops, terminal agents, background agents, agent frameworks, UI layers, context, security, and cost.

If I were rebuilding my AI coding workflow on May 30, 2026, I would not pick one magic tool. I would pick a layered stack: terminal agent, editor, background agent, Mastra, CopilotKit, MCP, context, security, and cost controls.

May 2026 was not about one more coding model leaderboard. The useful signal was control planes, UI-agent contracts, durable TypeScript workflows, usage economics, and runtime security.

GitHub is suddenly full of codebase knowledge graph projects for Claude Code, Codex, Cursor, and other agents. The useful version is not a pretty graph. It is a map that changes planning, editing, and review.

The models.dev project is trending because AI teams need one boring source of truth for model specs, pricing, context windows, modalities, and tool support.

Runtime's Launch HN thread is a useful signal: teams do not just want isolated coding agents. They want a control plane for approvals, secrets, telemetry, review, and merge policy.

Anthropic's Stainless acquisition is not just an SDK deal. It is a bet that agents need generated SDKs, CLIs, docs, and MCP servers from the same source of truth.

Anthropic's June 15 Agent SDK credit split is not just a pricing tweak. It is a signal that autonomous coding workflows need separate budgets, lanes, and receipts.

Claude Code's newer plugin URL and hard-deny controls are small release-note items with a big implication: agent extensions now need supply-chain discipline.

Codex CLI 0.129.0 added modal Vim editing in the composer. The feature is small, but it points at a bigger shift: terminal agents are becoming native engineering workbenches.

Thinking Machines' interaction-models post points at a useful shift for developer tools: stop designing around single chat turns and start designing around shared work.

Graphify is trending because coding agents keep hitting the same wall: they can edit files, but they still need a durable map of how the codebase, docs, schemas, and decisions connect.

Claude Managed Agents now have multiagent sessions, outcomes, webhooks, and vault events. The practical takeaway is not just better agents. It is that agent runs need backend job discipline.

InsForge is trending because coding agents can scaffold UI faster than they can safely operate databases, auth, storage, functions, and deployments. The backend now needs an agent-readable control plane.

What if your dev tools weren't separate apps but one operating system? The thesis behind /os and /suites - small, sharp tools that compound into a coherent layer.

Terminal agents like Claude Code, Codex CLI, OpenCode, Copilot CLI, and DeepSeek-TUI are converging on the same runtime layer: permissions, sandboxing, rollback, diagnostics, subagents, receipts, and cost controls.

Cline is a free, open-source VS Code extension that brings autonomous AI coding to your editor. It works with local models or cloud APIs, handles multi-file changes, and runs terminal commands without proprietary lock-in.

Codex automations are useful when recurring engineering work has clear inputs, reviewable outputs, and safe boundaries. Here is the practical playbook.

OpenAI is turning Codex from a coding assistant into a broader agent workspace for files, apps, browser QA, images, automations, and repeatable knowledge work.

Google's skills repo is a useful signal: agents do not just need generic coding help. They need product-specific operating instructions that make docs executable.

The andrej-karpathy-skills repo exploded because every coding agent needs behavioral rails. The useful move is not copying it blindly, but turning the rules into repo-specific operating constraints.

A deep comparison of Codex's new /goal loop and Claude managed agents outcomes, with practical workflow examples, control tradeoffs, and migration guidance for long-running tasks.

Flue is trending because it names the part of agent infrastructure that is becoming product-critical: the programmable harness around the model.

GitHub Copilot is moving from autocomplete into asynchronous coding agents, terminal workflows, MCP, skills, and model choice. Here is what changed in 2026.

jcode is trending because it competes on a less glamorous but important agent metric: how cheap it is to keep many coding sessions alive.

Microsoft's lib0xc landed on Hacker News with a practical message: safer systems code often means better C APIs, warnings, bounds checks, and incremental adoption, not a heroic rewrite.

Most agent tool APIs are just REST endpoints with nicer names. Production agents need intent-shaped tools that compress workflows, reduce context, and return reviewable receipts.

Open Design is trending because it turns Claude Code, Codex, Cursor, Gemini, and other CLIs into a design engine. The useful lesson is not design automation. It is artifact-first agent wrappers.

A trending refusal-direction paper is a reminder that model safety cannot be treated as a thin refusal layer. Builders need layered controls around the model.

VS Code 1.118 makes Copilot a Git co-author by default for chat and agent commits. The argument is not really about one trailer line. It is about consent, audit signals, and who controls developer workflow metadata.

Agent runs are opaque. TraceTrail turns a Claude Code JSONL into a public share link with a stepped timeline of messages, tool calls, and tokens.

Claude Code hooks are powerful, but discovery and install still feel like manual JSON surgery. The Hookyard prototype shows what a hook package manager should become.

A curated directory of 312 Claude Code skills, plus Pro tools for authors who want analytics, version pinning, and a real submission flow.

The second half of our agent tooling release: distribution, validation, and ergonomics layered on top of the first six. Six small CLIs, one through-line.

Two quality-of-life tools we built this week for Claude Code daily drivers: a SKILL.md linter and a VS Code status bar that shows live LLM spend.

Multica is pushing the agent teammate pattern: assign issues, route work to local runtimes, stream progress, and compound skills. Here is the practical read for AI dev teams.

Most MCP servers are noise. After shipping 24 apps with Claude Code, these are the five I reach for every time.

From Claude Code to Gladia, the ten CLIs every AI-native developer should know. Install commands, trade-offs, and when to reach for each.

Four agents, same tasks. Honest trade-offs from a developer shipping production apps with all of them.

A practical breakdown of GitHub Copilot Pro and Pro+ in 2026, focused on premium request economics, the June 2026 move to AI Credits, and how to avoid request-burn surprises.

An opinionated guide to the MCP server ecosystem in 2026. Curated picks by category, real configuration examples, installation commands, and honest assessments of what works and what does not.

AI agent work needs local observability. OpenTelemetry, OTLP, Vercel AI SDK telemetry, and lightweight trace viewers give developers receipts for model calls, tool use, latency, errors, and cost before anything goes to production.

The AI coding market just passed 90% developer adoption. Here's what the data actually says about which tools are winning, what's shifting, and where this is all heading.

The creators of Ruff and uv are joining OpenAI. Here is what this means for the Python ecosystem, AI tooling, and why OpenAI is investing in developer infrastructure.

From terminal agents to cloud IDEs - these are the AI coding tools worth using for TypeScript development in 2026.

OpenClaw has 247K stars and zero MCPs. The best tools for AI agents aren't new protocols - they're the CLIs developers have used for decades.

Claude Code's popularity is not an accident. It won because terminal agents fit how software already works: files, shell commands, git, logs, project memory, and reviewable text.

OpenAI shipped a new feature in the ChatGPT macOS app that lets it read context from VS Code, Xcode, Terminal, and iTerm2. Here is how to set it up, what it can actually do today, and why the future of this feature matters more than the current version.

Cursor started as an open-source code editor and evolved into one of the most popular AI coding tools available. Here is a hands-on look at its key features, pricing tiers, and how it compares to traditional editors like VS Code.

Visual testing tool for Model Context Protocol servers. Like Postman for MCP - call tools, browse resources, and view real-time logs in a browser UI. Zero install via npx.
MCP Tools

Lightweight CLI for discovering and calling MCP servers. Dynamic tool discovery reduces token consumption from 47K to 400 tokens. Three subcommands: info, grep, call.
MCP Tools

Centralized manager for MCP servers. Connect once to localhost:37373 and access all your servers through a single endpoint. REST API, web UI, and VS Code config compatible.
MCP Tools

Registry and hosting platform for MCP servers. 6,000+ servers indexed. One-command install and configuration via CLI. Supports local and hosted deployments.
MCP Tools

Largest MCP server directory with 17,000+ servers. Security grading (A/B/C/F), compatibility scoring, and install configs. ChatGPT-like UI for browsing and testing.
MCP Tools

AI-powered terminal built in Rust with GPU rendering. Block-based output, natural language commands, Agent Mode for autonomous tasks. 700K+ developers. Free tier available.
Productivity

A hosted infinite canvas your headless AI agents drive over MCP. Any MCP-speaking agent - Claude Code, Codex, Cursor, or a script - creates HTML docs, images, and video on a live canvas, streamed in as it builds.
Productivity