204 items
198 posts, 2 tools, 4 guides
F3 is trending on Hacker News as a research prototype for a future-proof columnar file format. The useful takeaway is not to replace Parquet tomorrow. It is that data files are starting to carry more of their own runtime contract.
GitHub's June Copilot updates point beyond autocomplete: CLI access, bring-your-own-key model routing, AI credit metrics, and external agent providers make Copilot a governed agent platform.
LangChain's rubrics for Deep Agents point at a practical agent pattern: self-correction works only when rubrics are versioned, executable, and sampled against human review.
Mistral OCR 4 and Baidu's Unlimited OCR both hit Hacker News today. The useful takeaway for developers is that OCR is no longer just text extraction. It is becoming a runtime decision for document agents.
OpenAI's June deprecations put Agent Builder, hosted Evals, and reusable prompts on a November 30 shutdown path. Here is the practical migration plan: Agents SDK, repo-owned prompts, and eval receipts.
OpenMontage is trending because it treats video production like a repo-shaped agent workflow: scripts, assets, render pipelines, review loops, and coding agents working across the whole process.
New role-confusion research explains why prompt injection keeps surviving better prompts. Models do not reliably perceive which text is instruction, tool output, user content, or their own reasoning.
A developer used OpenAI Codex to build a fully open-source WYSIWYG editor for TikZ figures. The technical approach and reception on Hacker News offer a useful case study in what agent-built software looks like when shipped.
Microsoft merged AutoGen and Semantic Kernel into a single production-ready SDK. Here is everything developers need to know: architecture, installation, migration paths, pricing, and when to use it over LangGraph or CrewAI.
Oak rethinks version control for agentic workflows with virtual mounts, faster snapshots, and lower VCS-related token overhead. Here's what the HN community thinks about this Show HN.
Sakana says Fugu Ultra stands with Fable, Mythos, GPT-5.5, Gemini, and Opus by orchestrating models instead of being one giant model. Here is what the benchmarks show, what is novel, and what still needs proof.
Sakana Fugu Ultra is not just another giant model. It is a learned orchestration layer that routes work across expert models, matches frontier benchmark claims, and makes a serious case for multi-model AI systems.
The Bayer and Thoughtworks PRINCE case study is a useful reminder that reliable agentic AI comes from context routing, traces, evals, monitoring, and human review, not from a better prompt alone.
As coding agents get easier to delegate to, the scarce resource shifts from code generation to review capacity, CI minutes, environment reliability, and merge discipline.
Hex's data-agent lab shows the practical eval pattern AI teams should copy: compare candidates against stable baselines, keep receipts, and judge changes by task behavior.
Cloudflare shipped wrangler deploy --temporary on June 19, 2026. AI agents can now deploy Workers, D1 databases, and KV stores without browser auth flows. Here is how it works.
Goal, loop, routine. Three verbs, two tools, one hard part. A complete field guide to running agentic loops in Claude Code and Codex, the real commands, the patterns people actually run, and the two failure modes that burn money.
The MCP 2026-07-28 release candidate drops sessions entirely. Here is what changes, what breaks, and how to migrate your MCP servers before the July 28 deadline.
MCP's new enterprise-managed authorization flow is not just less login friction. It moves agent tool access into identity, policy, and audit systems enterprises already understand.

New tutorials, open-source projects, and deep dives on coding agents - delivered weekly.