Summary#
Thinking Machines Lab's framing of why current AI interfaces limit collaboration: the turn-based interface is a bandwidth bottleneck between human and model. It is the problem Interaction Models are built to dissolve.
The two-claim argument#
- AI labs over-optimize for autonomy. Labs treat autonomous capability as the model's most important property; as a result, today's models and interfaces "aren't optimized for humans to remain in the loop." But in most real work users can't fully specify requirements upfront and walk away — good results come from a collaborative loop of clarification and feedback.
- Humans get pushed out by the interface, not the work. "Humans increasingly get pushed out not because the work doesn't need them, but because the interface has no room for them." The fix is to let people collaborate with AI the way they collaborate with other people: messaging, talking, listening, seeing, showing, interjecting — and the model doing the same.
The mechanism: a single thread#
Today's models "experience reality in a single thread":
- Until the user finishes typing/speaking, the model waits with no perception of what the user is doing or how.
- Until the model finishes generating, its perception is frozen — no new information arrives until it finishes or is interrupted.
This narrow channel limits how much of a person's knowledge, intent, and judgement can reach the model, and how much of the model's work is legible to the human. Analogy: "trying to resolve a crucial disagreement over email rather than in person."
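The single-thread constraint can be made concrete with a toy simulation (all names here are mine, not the lab's): in a strict turn-based loop, the model's perception is only refreshed at turn boundaries, so anything that happens mid-turn simply never reaches it.

```python
# Minimal sketch of the half-duplex, turn-based loop the note describes:
# at any moment exactly one side holds the channel, and the other side's
# perception is frozen until the turn completes.

def turn_based_session(user_turns, model):
    """Alternate strict turns: the model sees nothing while the user types,
    and the user sees nothing while the model generates."""
    transcript = []
    for user_text in user_turns:
        # 1. The model waits, blind, until the user's turn is complete.
        transcript.append(("user", user_text))
        # 2. The model generates from the frozen transcript; any input that
        #    arrives now (an interjection, a visual cue) has nowhere to go.
        reply = model(transcript)
        transcript.append(("model", reply))
    return transcript

# Toy "model": reports how many events of context it could actually perceive.
def toy_model(transcript):
    return f"saw {len(transcript)} events"

log = turn_based_session(["hello", "actually, wait"], toy_model)
```

The point of the sketch is the shape of the loop, not the toy model: perception updates only between `user_turns` items, which is exactly the narrow channel the analogy describes.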
Why harnesses don't fix it#
Existing real-time systems bolt interactivity on with a harness — VAD (voice-activity detection), turn-boundary prediction, dialog state machines — components "meaningfully less intelligent than the model itself." That harness precludes whole interaction modes:
- proactive interjection ("interrupt when I say something wrong")
- reaction to visual cues ("tell me when I've written a bug in my code")
- speak-while-listening ("translate Spanish→English live")
- speak-while-watching ("live-commentate this sports game")
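To see why the harness is "meaningfully less intelligent than the model itself," here is a toy sketch of one such component (all thresholds invented for illustration): an energy-threshold VAD plus a silence-timeout turn-end detector. Nothing in it is the model; the decision "the user is done, respond now" is a fixed heuristic that cannot consider what is actually being said.

```python
# Toy harness component: energy-threshold voice-activity detection plus a
# silence-timeout rule for declaring the end of a turn. Thresholds are
# illustrative, not from any real system.

SPEECH_ENERGY = 0.5      # frames above this count as "voice activity"
END_SILENCE_FRAMES = 3   # this many consecutive quiet frames ends the turn

def detect_turn_end(frame_energies):
    """Return the frame index where the harness declares the turn over,
    or None if the user is still 'speaking' at the end of the stream."""
    silent = 0
    for i, energy in enumerate(frame_energies):
        silent = silent + 1 if energy < SPEECH_ENERGY else 0
        if silent >= END_SILENCE_FRAMES:
            return i
    return None

# A mid-sentence pause long enough to trip the timeout gets cut off, which
# is the failure mode that precludes the richer interaction modes above.
cut = detect_turn_end([0.9, 0.8, 0.1, 0.1, 0.1, 0.9])
```

Because the turn boundary is decided by frame energies alone, the harness cannot distinguish a thoughtful pause from a finished turn, let alone decide to interject or stay silent for a content-level reason.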
The Bitter Lesson predicts that these hand-crafted systems get outpaced by general capability growth, so the resolution is to make interactivity model-native (see Time-Aligned Micro-Turns).
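The model-native alternative can be sketched as well (a hedged illustration of the micro-turn idea; the names and chunk handling are mine): instead of whole turns, input and output interleave as small time-aligned chunks of roughly 200 ms, so the model's perception is refreshed every chunk and either side can react mid-utterance.

```python
# Sketch of time-aligned micro-turns: input and output as interleaved
# ~200 ms chunks with no turn boundaries. Silence is a valid micro-turn,
# so "listening" and "speaking" are just chunk-level choices.

CHUNK_MS = 200  # illustrative chunk duration

def micro_turn_session(input_chunks, model_step):
    """Alternate tiny input/output chunks instead of whole turns.
    model_step sees everything so far and may emit a chunk or stay silent."""
    timeline = []  # interleaved (t_ms, direction, chunk) events
    for i, chunk in enumerate(input_chunks):
        t = i * CHUNK_MS
        timeline.append((t, "in", chunk))
        out = model_step(timeline)   # perception refreshed every chunk
        if out is not None:          # None = a silent micro-turn
            timeline.append((t, "out", out))
    return timeline

# Toy step: proactively interject the moment a "bug" is perceived,
# without waiting for the user's turn to end.
def toy_step(timeline):
    latest = timeline[-1]
    return "wait, that's a bug" if "bug" in latest[2] else None

tl = micro_turn_session(["def f(:", "oops a bug", "..."], toy_step)
```

With this loop, proactive interjection falls out of the architecture rather than a harness rule: the model decides chunk by chunk whether to speak, which is the move the micro-turns note develops.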
Connections#
- Interaction Models — the proposed resolution
- The Bitter Lesson — why the harness-based status quo loses
- Time-Aligned Micro-Turns — the architectural move that removes turn boundaries
- Full-Duplex Interaction — the interaction modes the bottleneck currently blocks
- Harness Shrinkage as Models Improve — the general version of "the less-intelligent harness should dissolve into the model"
- AI Employee Framing / Human-AI Accountability Redesign — the org-side mirror: both warn against treating autonomy as the goal and pushing the human to the margin; this page is the interface-side version of the same critique
- Design Concept Grilling — argues the value is in collaborative iteration; this page argues the interface is what blocks it
- Context Window Smart Zone — orthogonal limitation that also makes "fully autonomous, walk away" brittle
Sources#
8 articles link here
- Concept: AI Employee Framing
Kropp et al. (HBR May 2026, n=1,261): framing AI agents as "employees" vs "tools" cuts personal accountability −9pp, in…
- Essay: Opinions on Using AI Tools & the Future of the Software Engineering Role
Debate map of four stances on using AI tools (bullish-insider / pragmatist-practitioner / skeptic-governance / architec…
- Concept: Design Concept Grilling
Matt Pocock's `grill-me` skill; reach Brooks "design concept" before any plan; counter to specs-to-code; PRD as destina…
- Concept: Full-Duplex Interaction
Perceive-and-respond simultaneously across modalities; proactive interjection, visual-cue reactions, simultaneous speec…
- Concept: Interaction Models
Thinking Machines Lab (May 2026): models that handle audio/video/text interaction natively in real time instead of via…
- Concept: The Bitter Lesson
Sutton 2019: scaled general methods beat hand-engineered structure; recurring justification across the wiki for dissolv…
- Entity: Thinking Machines Lab
AI research lab behind interaction models (May 2026); harness-dissolves-into-model thesis; upstreamed streaming-session…
- Concept: Time-Aligned Micro-Turns
The core interaction-model move: input/output as continuous streams in ~200ms interleaved chunks, no turn boundaries; s…
Related articles#
- Concept: Interaction Models
Thinking Machines Lab (May 2026): models that handle audio/video/text interaction natively in real time instead of via…
- Concept: Agent Harness Engineering
Patterns for scaffolding long-running LLM agents: environment design, progressive context disclosure, mechanical archit…
- Essay: Opinions on Using AI Tools & the Future of the Software Engineering Role
Debate map of four stances on using AI tools (bullish-insider / pragmatist-practitioner / skeptic-governance / architec…
- Concept: Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
- Concept: Encoder-Free Early Fusion
Multimodal design with minimal pre-processing instead of large standalone encoders: dMel audio embedding, 40×40-patch h…
