9 items
3 posts, 5 tools, 1 guide
DeepSeek V4 splits into Flash and Pro, ships a 1M context window, and undercuts every closed model on price. Here's how to wire it up with the OpenAI SDK, when to pick it over Claude or GPT, and what changed since V3 and R1.
A hands-on developer guide to Mercury 2 from Inception Labs. OpenAI-compatible API, reasoning levels, tool use, structured outputs, and when a diffusion LLM beats an autoregressive one in real apps.
OpenAI is sunsetting the Assistants API in 2026. Here is a tested migration plan to the Responses API - code, state, threads, tools, every cliff I hit, in order.
Managed scheduling on Anthropic infrastructure with API and GitHub triggers.
Run 50,000+ ML models with a simple API. No infrastructure management. Pay-per-second billing. Deploy custom models with Cog. Popular for image generation and audio.
Fastest inference for open-source models. 200+ models via unified API. Ranks #1 on speed benchmarks for DeepSeek, Qwen, Kimi, and Llama. Serverless pay-per-token pricing.
Wafer-scale AI inference at 3,000+ tokens/sec. The WSE-3 chip has 4 trillion transistors and 900K AI cores. 20x faster than GPU providers. OpenAI partnership for inference.
Unified API for 200+ models. One API key, one billing dashboard. OpenAI, Anthropic, Google, Meta, Mistral, and more. Automatic fallbacks and load balancing.

New tutorials, open-source projects, and deep dives on coding agents - delivered weekly.