AI MODELS

47 items

46 posts, 1 guide

Blog · Jul 4, 2026

Claude Sonnet 5 Developer Guide: Migration, API, and Effort Levels

Everything developers need to migrate from Sonnet 4.6 to Sonnet 5 - three breaking API changes, the new effort parameter, tokenizer impact, and when to use each effort level. Verified against Anthropic's official docs on July 4, 2026.

Claude Anthropic AI Models Developer Guide

Blog · Jul 1, 2026

Claude Sonnet 5 vs Sonnet 4.6: Should You Upgrade?

Claude Sonnet 5 lands near Opus 4.8 on some tasks for a fraction of the price - but its new tokenizer produces about 30 percent more tokens for the same text. Here is the upgrade decision for builders, with the numbers.

Claude Sonnet 5 Anthropic AI Models

Blog · Jul 1, 2026

Running Fable 5 Agent Fleets in Production: The Operations Guide

Standing up a fleet of Fable 5 agents is the easy part. This is the operations layer - data retention rules, refusal-rate alerting, effort tuning, observability, and availability planning - that keeps the fleet running.

Fable 5 AI Agents Anthropic AI Models

Blog · Jul 1, 2026

Fable 5 Is Back: The Anthropic Model the Government Switched Off

Anthropic's most capable model launched, got suspended by a US export-control order, and returned today. Here is what Fable 5 is, what changed on the way back, and whether builders should reach for it.

Fable 5 Anthropic Claude AI Models

Blog · Jul 1, 2026

Fable 5 vs Opus 4.8: Which Should Orchestrate Your Agents?

The orchestrator is the most important model choice in an agent fleet. A fair head-to-head between Fable 5 and Opus 4.8 for that role, with a decision matrix by run length, budget, compliance, and refusal-handling tolerance.

Fable 5 AI Agents Anthropic AI Models

Blog · Jul 1, 2026

GLM 5.2 in 9 Minutes: The Open-Weight Rival to GPT-5.5

A companion guide to the GLM 5.2 video: an open-weight model positioned against GPT-5.5, walked through with benchmarks, pricing, and a live OpenCode demo. Here is what the video covers and where to go deeper.

glm open-weight-models opencode ai-models developer-tools

Blog · Jul 1, 2026

GPT-5.5 in 7 Minutes: Benchmarks, Codex Agents, Context Window, and Pricing

A companion guide to the GPT-5.5 video: OpenAI's newly released model rolling out to ChatGPT and Codex, reviewed through benchmarks, agent capabilities, context window, and pricing. Here is what the video covers and where to go deeper.

gpt-5-5 openai codex ai-models developer-tools

Blog · Jun 30, 2026

Claude Sonnet 5 Launch Analysis: The Most Agentic Sonnet Yet

Anthropic releases Claude Sonnet 5 with improved agentic capabilities, better tool use, and an introductory pricing deal. Here's what developers need to know.

News Hacker News Claude Anthropic AI Models

Blog · Jun 22, 2026

Apertus: Europe's Answer to AI Sovereignty - and Why HN Is Skeptical

Switzerland's fully open foundation model promises transparent training data and EU compliance. The HN crowd has questions about actual performance.

News Hacker News AI Models Open Source LLMs

Blog · Jun 22, 2026

Fugu Ultra's Frontier Performance Claim, Explained Without the Hype

Sakana says Fugu Ultra stands with Fable, Mythos, GPT-5.5, Gemini, and Opus by orchestrating models instead of being one giant model. Here is what the benchmarks show, what is novel, and what still needs proof.

ai-benchmarks ai-models model-routing ai-agents

Blog · Jun 22, 2026

Sakana Fugu and the Case for Not Betting Everything on One Proprietary Model

Sakana Fugu makes a timely argument for model routing: frontier performance should come from swappable systems, not a hard dependency on one proprietary API.

model-routing ai-infrastructure ai-models vendor-lock-in

Blog · Jun 22, 2026

Sakana Fugu Ultra: The Model Router Making the Frontier Look Less Proprietary

Sakana Fugu Ultra is not just another giant model. It is a learned orchestration layer that routes work across expert models, matches frontier benchmark claims, and makes a serious case for multi-model AI systems.

ai-models model-routing ai-agents open-models

Blog · Jun 21, 2026

How to Use GLM 5.2 and Other Custom Model Providers in Codex

Codex can point at OpenAI-compatible model providers, local Ollama servers, and internal model proxies. Here is the practical config pattern, the sharp edges, and when to use it.

Codex AI Coding Developer Tools AI Models Configuration

Blog · Jun 20, 2026

The Router Era: Why Not Owning a Frontier Model Became an Advantage

No single model wins every task anymore, and the companies that never trained one - Factory, Devin, Perplexity, Cursor, OpenCode - are turning that into a moat. This is how model routing works, why open weights and neoclouds make it cheap, and the honest counter-argument.

ai-models model-routing ai-coding-tools open-weights agents

Blog · Jun 17, 2026

GLM-5.2 Cost Math: When Open-Weights Coding Models Actually Save You Money

Z.ai's GLM-5.2 lands as a 753B open-weights coding model that beats GPT-5.5 on SWE-bench Pro for roughly one-sixth the per-token cost. Here is the real cost math, a worked cost-per-task example, and a when-to-use-which decision guide.

pricing ai-models open-weights glm ai-coding-tools

Blog · Jun 17, 2026

Model Routing Recipes: Practical Config Patterns to Cut AI Spend

A code-heavy field guide to model routing. Real, runnable-style configs for tiering tasks by complexity, routing simple work to open-weights, reserving frontier models for hard reasoning, building failover chains, and keeping prompt caches warm with OpenRouter, LiteLLM, and Factory Router.

pricing orchestration ai-models litellm openrouter

Blog · Jun 15, 2026

OpenRouter Fusion Makes Model Panels Real. Use Them Like Escalation, Not Autopilot

OpenRouter Fusion turns multi-model panels into an API feature. The useful lesson is not to run every prompt through more models. It is to define when a task deserves an expensive second opinion.

OpenRouter AI Models Model Routing Developer Tools AI Infrastructure

Blog · Jun 11, 2026

The Claude Tokenizer Change: What ~30% More Tokens Means for Your Bill

Anthropic's docs say the tokenizer introduced with Opus 4.7 can use up to 35% more tokens for the same text. Here is what that does to per-request cost, max_tokens, and cross-model comparisons.

Anthropic AI Models LLMs Developer Tools

Blog · Jun 11, 2026

Fable 5 with 1M Context: What Actually Works in Practice

Fable 5 1M context workflows that actually work: whole-repo reviews, log archaeology, multi-doc synthesis - plus the honest math on when RAG still wins.

AI Models Anthropic Context Engineering LLMs

Blog · Jun 11, 2026

Fable 5 Effort Levels Explained: low to xhigh, and What They Cost You

Fable 5 effort levels explained: what low, medium, high, xhigh, and max actually change, which models support each level, and how effort drives your token bill.

Anthropic AI Models Claude Code LLMs
