Forge hit the Hacker News front page with a strong claim: small local models can become much more useful at tool-calling when the harness catches structural failures, retries intelligently, and controls context.
The trending Free Claude Code repo is not just about avoiding API bills. It points at a bigger developer-tool pattern: model gateways for AI coding agents.
How KV caching speeds up LLM inference - the math, the code, the memory tradeoffs, and when it stops helping. Every dev running local models hits this wall.
Alibaba's newest Qwen release claims flagship-level coding in a 27B dense model. Here is why dense matters, where it fits against the 480B MoE coder, and what it unlocks for local inference.
Open-source AI code assistant for VS Code and JetBrains. Bring your own model - local or API. Tab autocomplete, chat, inline edit. Fully customizable.
