Sources#

Interaction Models: A Scalable Approach to Human-AI Collaboration

Summary#

Interaction Models are architected as two cooperating models:

a time-aware interaction model that maintains real-time presence — perceiving and responding in a continuous loop (see Time-Aligned Micro-Turns);
an asynchronous background model that handles sustained reasoning, tool use, and longer-horizon work.

The payoff: the user gets both responsiveness and depth — "the planning, tool-use, and agentic workflows of reasoning models at the response latency of non-thinking ones."

How delegation works#

When a task needs deeper reasoning than can be produced instantly, the interaction model delegates to the background model, which runs asynchronously.
The handoff is a rich context package — not a standalone query, but the full conversation.
The interaction model stays present throughout — answering follow-ups, taking new input, holding the thread.
Results stream back as the background model produces them. The interaction model interleaves these updates into the conversation at a moment that fits what the user is currently doing, rather than as an abrupt context switch.
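The steps above can be sketched with `asyncio`. This is a minimal illustration under stated assumptions, not TML's implementation: `Conversation`, `background_model`, and `interaction_model` are hypothetical stand-ins, the "slow reasoning" is a short sleep, and "an appropriate moment" is simplified to "the user has no pending input".

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Shared transcript; every entry is a (speaker, text) pair."""
    turns: list = field(default_factory=list)

    def add(self, speaker: str, text: str) -> None:
        self.turns.append((speaker, text))


async def background_model(context: list, results: asyncio.Queue) -> None:
    """Hypothetical background model: streams partial results as it works."""
    # It receives the full conversation so far, not a standalone query.
    for step in ("plan drafted", "tools called", "final answer"):
        await asyncio.sleep(0.01)  # simulated slow reasoning / tool use
        await results.put(f"{step} (saw {len(context)} turns)")
    await results.put(None)  # sentinel: nothing more to stream


async def interaction_model(convo: Conversation, user_inputs: list) -> None:
    """Hypothetical interaction model: stays present while the work runs."""
    results: asyncio.Queue = asyncio.Queue()
    # Delegate: hand over a snapshot of the whole conversation, then carry on.
    task = asyncio.create_task(background_model(list(convo.turns), results))
    pending = list(user_inputs)
    finished = False
    while pending or not finished:
        if pending:
            # Follow-ups are answered right away; the user is never blocked.
            text = pending.pop(0)
            convo.add("user", text)
            convo.add("interaction", f"quick reply to: {text}")
            await asyncio.sleep(0)  # let the background task make progress
        else:
            # Assumption: "user is idle" counts as a suitable moment to
            # interleave the next background update.
            update = await results.get()
            if update is None:
                finished = True
            else:
                convo.add("background", update)
    await task


async def main() -> Conversation:
    convo = Conversation()
    convo.add("user", "Plan a three-city trip and book the trains.")
    convo.add("interaction", "On it; I'll keep you posted.")
    await interaction_model(convo, ["Any vegetarian options?", "What about Sunday?"])
    return convo


convo = asyncio.run(main())
```

In this sketch the follow-up questions are answered before any background update lands in the transcript, and the background task sees only the two turns that existed at the moment of delegation. A real system would need a richer policy for when to surface updates and how to refresh the background model's context.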

Both halves are intelligent#

This isn't a "dumb frontend, smart backend" design. The interaction model on its own is "competitive on both interactive and intelligence benchmarks" (see Interactivity Benchmarks). For example, TML-Interaction-Small beats every non-thinking baseline on Audio MultiChallenge APR even without the background agent. Benchmarks marked * use the background agent for reasoning and tool tasks.

Relationship to other multi-model patterns#

This is the latency-vs-depth axis of multi-model orchestration, distinct from:

the role-based model selection in Client-Side Agent Optimization (assign cheap/expensive models per role in an agent graph) — there the split is cost-driven and static; here it's latency-driven and dynamic-per-turn;
the three-agent / reviewer-in-fresh-context pattern (Deep Modules for Agents, Agent Harness Engineering) — there the split is for context isolation; here it's for temporal concerns (stay responsive vs. think hard).
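The static-versus-dynamic contrast in the first item can be made concrete with a small sketch. Everything here is a hypothetical illustration: the model names, the `Turn` fields, and the latency budget are assumptions, and in the source the interaction model itself decides when to delegate rather than applying a fixed rule like this one.

```python
from dataclasses import dataclass

# Static, cost-driven: each role in the agent graph is pinned to a model
# up front. The model names are placeholders, not real model identifiers.
ROLE_TO_MODEL = {"planner": "expensive-model", "summarizer": "cheap-model"}


def model_for_role(role: str) -> str:
    """Look up the model assigned to a role; the mapping never changes at runtime."""
    return ROLE_TO_MODEL[role]


@dataclass
class Turn:
    text: str
    needs_tools: bool              # hypothetical signal
    est_reasoning_seconds: float   # hypothetical signal


LATENCY_BUDGET_SECONDS = 0.5  # hypothetical threshold for "instant"


def route_turn(turn: Turn) -> str:
    """Dynamic, latency-driven: the decision is made afresh for every turn."""
    if turn.needs_tools or turn.est_reasoning_seconds > LATENCY_BUDGET_SECONDS:
        return "delegate-to-background"
    return "answer-inline"


print(model_for_role("planner"))
print(route_turn(Turn("What time is it?", False, 0.1)))
print(route_turn(Turn("Book the trains.", True, 0.1)))
```

The first function depends only on the role, so the same role always gets the same model. The second depends on the content of each turn, so the same conversation can alternate between inline answers and delegation.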

Open / acknowledged#

TML calls background agents "an essential capability" on which they have "just scratched the surface". They name two open directions: pushing background agentic intelligence to the frontier, and exploring how background agents work together with the interaction model.

Connections#

Interaction Models — parent concept
Time-Aligned Micro-Turns — what keeps the interaction model present while the background model thinks
Interactivity Benchmarks — *-marked results use the background agent; shows the split's contribution
Client-Side Agent Optimization — a different axis of multi-model design (cost/role, not latency/depth)
Deep Modules for Agents / Agent Harness Engineering — multi-agent splits for context isolation rather than latency
Harness Shrinkage as Models Improve — open question whether the split is permanent or a transitional artifact until one model is both fast and deep enough

Interaction / Background Model Split
