Mistral releases Leanstral 1.5, an Apache-2.0-licensed, 119B-parameter model (6B active) for Lean 4 theorem proving that saturates miniF2F and achieves SOTA on FATE benchmarks.
Describe an app in plain language and get a working single-file build back with a live sandboxed preview. Revise it by talking to it, share it with a link, or download the file. Here is what single-file buys you, how revisions work, the honest limits, and what it costs.
A decision framework for 2026: MCP servers give an agent access to a live system, Agent Skills teach it how to do a task. Here is when to build each, when to build both, and the criteria that actually decide it, grounded in the MCP spec and Anthropic's skills docs.
A companion guide to the Nimbalyst video: an open-source visual workspace that runs Codex and Claude Code from your existing subscriptions, with a Kanban board, a planning workflow, and AI commits. Here is what it does and where it fits.
A companion guide to the Agents 101 video: a behind-the-scenes walkthrough of building and deploying AI agents fast on Vercel, the agentic infrastructure stack. Here is the map of what to learn and where to go next.
A fair, sourced comparison of the TTS APIs developers reach for in 2026: OpenAI, ElevenLabs, xAI Grok, and Cartesia. Quality vs latency vs price, streaming, voice cloning policies, and whether to route through an AI gateway or go direct.
A companion guide to the Codex Record & Replay video: OpenAI Codex can now record a recurring computer task and replay it as a reusable automation skill. Here is what the feature is and where it fits.
A companion guide to the GLM 5.2 video: an open-weight model positioned against GPT-5.5, walked through with benchmarks, pricing, and a live OpenCode demo. Here is what the video covers and where to go deeper.
The Godot Foundation has established a policy banning autonomous AI agent code and substantial AI-generated contributions, citing reviewer burnout and concerns about maintainer mentorship.
A companion guide to the GPT-5.5 video: OpenAI's newly released model rolling out to ChatGPT and Codex, reviewed through benchmarks, agent capabilities, context window, and pricing. Here is what the video covers and where to go deeper.
A companion guide to the Loop Engineering video: the shift from repeatedly prompting an LLM to building long-running loops, goals, and automations. Here is the core idea and where to go deeper.
A companion guide to the OpenAI Codex video: a tour of the Codex desktop app, its plan and goal modes, plugins, multi-agent workflows, and UI annotation. Here is what the video shows and where to go deeper.
Ngrok engineer Sam Rose ported 100,000 lines of Kubernetes to TypeScript, creating a browser-based cluster for educational use, with 2,059 tests proving it behaves like real k8s.
A hosted infinite canvas your headless AI agents drive over MCP. Any MCP-speaking agent (Claude Code, Codex, Cursor, or a script) creates HTML docs, images, and video on a live canvas, streamed in as it builds.
A new project proposes a graphical shell layer for SSH that turns remote servers into browsable desktops. The HN discussion digs into architecture choices, the terminology debate, and whether this solves a real problem.
LangChain's June LangSmith updates point to a practical agent-ops pattern: Fleet templates, on-call triage, computer use, Slack interrupts, MCP auth, traces, and eval progress all belong in one operator loop.
OpenAI's June 2026 API changelog looks like scattered platform plumbing. Read together, moderation scores, workload identity, Admin APIs, prompt-cache retention, container billing, and Secure MCP Tunnel are the pieces teams need to run agents with real controls.
Grok Build is xAI's agentic CLI with 8 parallel subagents, a plan-first workflow, and Arena Mode for competing outputs. Installation, pricing, real commands, and how it compares to Claude Code and Codex.
Bumblebee is Perplexity's open-source scanner for detecting compromised packages, extensions, and MCP configs on developer machines. It is a read-only Go binary that checks npm, PyPI, Go modules, and 10+ ecosystems against exposure catalogs without running any install scripts. Here is how to set it up and use it.
AI-assisted development generates PRs faster than humans can review them. Here are the tools that help: CodeRabbit, DeepSource, Greptile, and others, compared on pricing, platform support, and security capabilities.