
TL;DR
Thinking Machines' interaction-models post points at a useful shift for developer tools: stop designing around single chat turns and start designing around shared work.
Thinking Machines' post on interaction models is one of the more useful AI interface pieces to land this week because it names a problem every developer-tool team is running into: chat is not the final shape.
Turn-based chat is great for asking a question. It is awkward for shared work.
Coding agents already proved that. A serious agent session is not one prompt and one answer. It is a loop of reading files, asking clarifying questions, editing code, running tests, showing diffs, getting corrected, opening browser checks, and leaving a receipt. That is why terminal agents are becoming runtime surfaces, why Codex loops matter, and why long-running agent harnesses keep showing up.
The next interface layer is not "better chat." It is better coordination.
Thinking Machines describes interaction models as systems that handle multimodal, real-time collaboration across audio, video, and text. The important idea is not merely multimodality. The important idea is that the model participates in an ongoing interaction instead of waiting for a fully packaged prompt.
For developer tools, that maps cleanly to the work we already do: reading and editing files in a repo, running tests and commands in a terminal, reviewing diffs and screenshots, and tracking issues through to deployment.
That is a different product shape from a chat box glued beside an editor.
Chat forces developers to serialize messy work into text.
You have to explain which files matter, what you already tried, which tests are failing, and what constraints the change has to respect.
A good coding agent can infer some of that from the repo, but the interface still makes the human do too much packaging.
This is why tools keep adding richer surfaces: IDE diffs, terminal execution, browser screenshots, task plans, subagents, worktrees, PR comments, and persisted instructions. They are not decorations. They are attempts to escape the limitations of pure chat.
In developer tools, an interaction model should treat the repo, terminal, browser, issue tracker, and human as parts of one workspace.
Imagine a coding agent interface where the repo, terminal, and browser are one shared workspace, where test results and screenshots flow back to the model automatically, and where issues link to the diffs and deployment receipts that closed them.
That is not science fiction. Pieces of it already exist across Claude Code, Codex, Cursor, Zed, GitHub Copilot, and browser automation workflows. The problem is that the pieces are still fragmented.
There is a fair counterargument: chat is simple, universal, and composable. A text box can drive anything. Developers already understand it. APIs are easier. Logs are easier. Automation is easier.
I agree with the first half. Chat should not disappear.
But chat should become one control among many, not the whole interface. Just as command lines did not disappear when IDEs improved, text prompts will remain useful. They just should not be responsible for carrying every bit of state.
The best developer tools will support text, but they will not force every interaction through text.
The real prize is shared state.
Developer work has a lot of state: files and diffs in flight, failing tests and terminal output, browser checks and screenshots, open issues and deployment status.
Chat transcripts are a poor database for that. They are verbose, ambiguous, and hard to resume. A better interaction model should store task state explicitly.
That is why agent context reduction matters. The goal is not to stuff more transcript into a context window. The goal is to keep the right state in the right structure.
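As a sketch of what "the right state in the right structure" could mean, here is explicit task state modeled as a small record instead of a transcript. The field names are illustrative assumptions, not any real tool's schema:

```python
from dataclasses import dataclass, field


@dataclass
class TaskState:
    """Explicit, resumable task state -- instead of replaying a chat transcript."""
    goal: str
    touched_files: list[str] = field(default_factory=list)
    failing_tests: list[str] = field(default_factory=list)
    receipts: list[str] = field(default_factory=list)  # diffs, screenshots, test runs

    def summary(self) -> str:
        """What the model actually needs in context: structured state, not prose."""
        return (
            f"goal: {self.goal}\n"
            f"files: {', '.join(self.touched_files) or 'none'}\n"
            f"failing tests: {', '.join(self.failing_tests) or 'none'}"
        )


state = TaskState(goal="fix flaky auth test")
state.touched_files.append("auth/session.py")
state.failing_tests.append("test_refresh_token")
print(state.summary())
```

Resuming a session then means loading this record, not re-reading a thousand lines of chat.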
If you are building AI developer tools, do not wait for a perfect multimodal model to improve the interface. Start with the interaction contract.
Add these primitives: explicit, resumable task state; declared constraints the agent must respect; scoped tool access; an expected output shape; and a receipt for every action.
Those primitives make any model better because they reduce ambiguity.
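A hypothetical sketch of those primitives as a minimal interaction contract. Every name here is invented for illustration; no existing agent framework is implied:

```python
class AgentWorkspace:
    """Hypothetical contract: state in, constraints declared, receipts out."""

    def __init__(self, goal: str):
        self.state: dict = {"goal": goal, "diffs": [], "tests": []}
        self.constraints: list[str] = []
        self.receipts: list[tuple[str, str]] = []

    def declare_constraint(self, rule: str) -> None:
        # Constraints live in the workspace, not buried in a chat message.
        self.constraints.append(rule)

    def emit_receipt(self, action: str, evidence: str) -> None:
        # Every action leaves a reviewable trace instead of vanishing into chat.
        self.receipts.append((action, evidence))


ws = AgentWorkspace("fix flaky auth test")
ws.declare_constraint("no edits outside auth/")
ws.emit_receipt("ran tests", "2 passed, 1 failed: test_refresh_token")
print(len(ws.receipts), ws.constraints[0])
```

The point is not this particular API. It is that constraints and receipts are first-class objects the human can inspect, instead of sentences the model may or may not remember.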
The same idea applies outside code. A content automation should not only say "write a post." It should know the state of the series so far, the constraints of the house style, the tools it may use, and the output shape that counts as done.
That is exactly the loop behind skills as agent operating systems. A skill is a tiny interaction model: state, constraints, tools, and expected output.
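Under that framing, a skill could be sketched as a small declarative record plus a completeness check. The structure below is an assumption for illustration, not any vendor's actual skill format:

```python
# A skill as a tiny interaction model: state, constraints, tools, expected output.
weekly_post_skill = {
    "state": {"topic_queue": ["interaction models"], "last_published": None},
    "constraints": ["under 1500 words", "cite sources"],
    "tools": ["web_search", "publish_draft"],
    "expected_output": "markdown draft with a TL;DR",
}


def validate_skill(skill: dict) -> bool:
    """A skill is complete only when all four parts of the contract are present."""
    required = {"state", "constraints", "tools", "expected_output"}
    return required <= skill.keys()


print(validate_skill(weekly_post_skill))  # -> True
```

A check like this is what turns "prompt library" into "operating system": incomplete skills fail loudly before any model is invoked.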
Interaction models are a useful frame because they push AI tools beyond prompt-response thinking.
For developer tools, the future interface is a shared workspace where the model can coordinate across code, tests, browser state, voice, screenshots, issues, and deployment receipts.
Chat will still be there. It just will not be the whole product.
The best agent tools will feel less like asking a chatbot to code and more like working inside a system that understands the work in progress.
What is an interaction model?
An interaction model is a system design for how a model collaborates with users across time, modalities, and shared state. Instead of treating every request as a standalone chat turn, it handles ongoing work.
Why is chat-only a poor fit for coding work?
Coding work involves files, diffs, tests, terminals, screenshots, issue trackers, and deployment checks. A chat-only interface makes developers compress all of that state into text, which is inefficient and error-prone.
Will chat interfaces go away?
No. Text prompts remain useful. The shift is that chat becomes one input inside a richer workspace, not the entire interface.
Sources: Thinking Machines: Interaction Models, Hacker News discussion, Anthropic Claude Code overview, OpenAI Codex documentation, W3C Multimodal Interaction Architecture.