47 items
46 posts, 1 guide
Everything developers need to migrate from Sonnet 4.6 to Sonnet 5 - three breaking API changes, the new effort parameter, tokenizer impact, and when to use each effort level. Verified against Anthropic's official docs on July 4, 2026.
Claude Sonnet 5 lands near Opus 4.8 on some tasks for a fraction of the price - but a new tokenizer runs about 30 percent more tokens. Here is the upgrade decision for builders, with the numbers.
Standing up a fleet of Fable 5 agents is the easy part. This is the operations layer - data retention rules, refusal-rate alerting, effort tuning, observability, and availability planning - that keeps the fleet running.
Anthropic's most capable model launched, got suspended by a US export-control order, and returned today. Here is what Fable 5 is, what changed on the way back, and whether builders should reach for it.
The orchestrator is the most important model choice in an agent fleet. A fair head-to-head between Fable 5 and Opus 4.8 for that role, with a decision matrix by run length, budget, compliance, and refusal-handling tolerance.
A companion guide to the GLM 5.2 video: an open-weight model positioned against GPT-5.5, walked through with benchmarks, pricing, and a live OpenCode demo. Here is what the video covers and where to go deeper.
A companion guide to the GPT-5.5 video: OpenAI's newly released model rolling out to ChatGPT and Codex, reviewed through benchmarks, agent capabilities, context window, and pricing. Here is what the video covers and where to go deeper.
Anthropic releases Claude Sonnet 5 with improved agentic capabilities, better tool use, and an introductory pricing deal. Here's what developers need to know.
Switzerland's fully open foundation model promises transparent training data and EU compliance. The HN crowd has questions about actual performance.
Sakana says Fugu Ultra stands with Fable, Mythos, GPT-5.5, Gemini, and Opus by orchestrating models instead of being one giant model. Here is what the benchmarks show, what is novel, and what still needs proof.
Sakana Fugu makes a timely argument for model routing: frontier performance should come from swappable systems, not a hard dependency on one proprietary API.
Sakana Fugu Ultra is not just another giant model. It is a learned orchestration layer that routes work across expert models, matches frontier benchmark claims, and makes a serious case for multi-model AI systems.
Codex can point at OpenAI-compatible model providers, local Ollama servers, and internal model proxies. Here is the practical config pattern, the sharp edges, and when to use it.
No single model wins every task anymore, and the companies that never trained one - Factory, Devin, Perplexity, Cursor, OpenCode - are turning that into a moat. This is how model routing works, why open weights and neoclouds make it cheap, and the honest counter-argument.
Z.ai's GLM-5.2 lands as a 753B open-weights coding model that beats GPT-5.5 on SWE-bench Pro for roughly one-sixth the per-token cost. Here is the real cost math, a worked cost-per-task example, and a when-to-use-which decision guide.
A code-heavy field guide to model routing. Real, runnable-style configs for tiering tasks by complexity, routing simple work to open-weights, reserving frontier models for hard reasoning, building failover chains, and keeping prompt caches warm with OpenRouter, LiteLLM, and Factory Router.
OpenRouter Fusion turns multi-model panels into an API feature. The useful lesson is not to run every prompt through more models. It is to define when a task deserves an expensive second opinion.
Anthropic's docs say the tokenizer introduced with Opus 4.7 can use up to 35% more tokens for the same text. Here is what that does to per-request cost, max_tokens, and cross-model comparisons.
Fable 5 1M context workflows that actually work: whole-repo reviews, log archaeology, multi-doc synthesis - plus the honest math on when RAG still wins.
Fable 5 effort levels explained: what low, medium, high, xhigh, and max actually change, which models support each level, and how effort drives your token bill.

New tutorials, open-source projects, and deep dives on coding agents - delivered weekly.