Howardism, vol. 03 · quiet corner of the web

Human-in-the-Loop Boundaries

Published: May 28, 2026 · Filed: Essay · Topic: Interaction · Tags: Derived, Human AI Collaboration, Verification, Accountability, Agent Engineering · Reading: 9 min · Source: AI-synthesised

Humans belong at allocation, understanding, design-concept, risk, and accountability boundaries; they slow the system down as manual executors, universal reviewers, or ceremonial approvers.


Core answer

Humans should stay in the loop where the work requires understanding, allocation, accountability, risk tolerance, or shared design concept. They slow the system down when they occupy the loop as manual executors, universal reviewers, or ceremonial approvers for work that can be generated, checked, or routed mechanically.

The boundary is not "human vs AI." It is judgment vs throughput:

  • Keep humans in the loop for deciding what is worth spending compute on, what the work means, which tradeoffs are acceptable, and where trust boundaries sit.
  • Push humans out of the loop for rote production, style/lint review, obvious bug finding, repetitive spec-drift checks, and low-stakes routing that can be automated early.
  • Redesign the loop when the human is nominally accountable but cognitively overloaded; that is the AI Brain Fry / rubber-stamp zone.

Boundary table

| Work surface | Human should stay in the loop | Human is slowing the system down |
|---|---|---|
| Problem selection | Deciding what is worth building or investigating. This is the Compute Allocator role: spend cheap generation on the right problem. | Asking humans to manually produce all candidate plans/artifacts before the model explores the option space. |
| Understanding | Building the internal model needed to direct and judge agents. Outsource Your Thinking, Not Your Understanding makes this non-delegable: thinking/search can be outsourced; understanding cannot. | Treating every intermediate reasoning step as something the human must personally perform. The agent can do the thinking; the human must understand enough to steer and verify. |
| Design concept | Fuzzy, high-stakes, multi-branch decisions need Design Concept Grilling before planning. The human stays until there is shared understanding. | For short, well-scoped, low-ambiguity edits, grilling is overhead. The page explicitly says to skip it for narrow changes like renaming a function. |
| Verification | Humans stay on legal review, risk tolerance, trust boundaries, and expertise-heavy calls. Verification as the New Bottleneck names these as the remaining human review zone. | Style, lint, obvious bugs, and spec-drift checks belong to automated review, CI, tests, and Claude review. Manual review here becomes the bottleneck. |
| Accountability | Humans own decisions, deployment, approval, and consequences. AI Employee Framing shows why accountability cannot be assigned to the agent. | Naming agents as employees/teammates or putting them on org charts diffuses responsibility, increases escalation, and reduces error catching without improving adoption. |
| Oversight volume | Humans should review concentrated high-stakes checkpoints and system-level behavior. | Per-output universal review does not scale. AI Brain Fry shows excessive oversight raises error rates; the human becomes a tired rubber stamp. |

The four places humans belong

1. Allocation: choosing where compute goes

Compute Allocator is the cleanest role definition: once generation is cheap, the scarce act is deciding what deserves generation. Thariq's 1% / 99% split makes the point: most generated tokens may be scaffolding, plans, interfaces, or disposable alignment artifacts; the small production residue is valuable because the surrounding compute made the decision better.

So the human belongs at the allocation boundary:

  • Which problem is worth attacking?
  • Which options deserve exploration?
  • Which artifact will make the next decision legible?
  • When is more generation useful, and when is it noise?

Putting the human deeper in the production loop is often scarcity-era behavior. If the model can generate ten options cheaply, the human should not spend the same attention drafting option one. They should judge the option set and redirect compute.

2. Understanding: knowing enough to direct

Outsource Your Thinking, Not Your Understanding is the hard stop against blind delegation. Agents can do search, drafting, synthesis, and many intermediate steps. But the human still needs the internal model that answers:

  • What are we trying to build?
  • Why is it worth doing?
  • What would count as correct?
  • What failure would matter?

This is why a knowledge base is not just retrieval. It is an understanding tool: multiple projections over the same information force the human to build a mental model. If the human lacks that model, "human in the loop" collapses into approval theater.

The practical boundary: delegate thinking steps, but do not delegate the understanding required to evaluate the result.

3. Design concept: aligning before plans exist

Design Concept Grilling identifies the upstream loop that should remain human-heavy: before a plan, reach the design concept. This is where the human's preferences, constraints, taste, and domain model are load-bearing.

The grilling loop is justified when:

  • the brief is fuzzy;
  • branches have dependencies;
  • the cost of choosing the wrong direction is high;
  • the agent would otherwise rush into a plausible but wrong plan;
  • the work needs shared understanding before implementation.

The same page gives the limiting rule: skip grilling for small, well-scoped work. Human-in-the-loop is not a religion. If the change is obvious and bounded, the interview just burns attention.

4. Risk and accountability: owning trust boundaries

Verification as the New Bottleneck draws the review line: automate mechanical verification, keep humans for legal review, risk tolerance, trust boundaries, and expertise. That is not a vague "humans supervise AI" slogan. It is a division of labor:

  • machines check the things with crisp predicates;
  • humans decide the things whose acceptability depends on context, consequences, and responsibility.

AI Employee Framing explains why this line matters. When AI is framed as an employee, managers with real AI-agent exposure show lower personal accountability, more escalation, and worse error catching. The agent cannot own the outcome. The human or organization deploying it owns the outcome.

Where humans should get out of the way

1. Mechanical review

If a check can be expressed as a test, lint rule, typecheck, CI gate, spec-drift comparison, or fresh-context automated review, it should move left into automation. Verification as the New Bottleneck is explicit: coding is no longer the slow part; confidence is. Manual review cannot keep up with exploded throughput unless mechanical checks are automated early.

Humans reviewing every style issue or obvious bug are not adding judgment. They are becoming an expensive queue.
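As a rough illustration of moving crisp-predicate checks ahead of any human reviewer, the sketch below runs a list of mechanical checks and requests human attention only when they all pass and the change touches a trust boundary. The `Change` fields, the two check functions, and the `triage` name are hypothetical, chosen for illustration; a real pipeline would plug in its own linters, tests, and CI gates.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Change:
    source: str                    # text of the proposed change
    touches_trust_boundary: bool   # hypothetical flag, assumed to be set upstream


# Hypothetical mechanical checks: each is a crisp predicate over the change.
def no_trailing_whitespace(change: Change) -> bool:
    return all(line == line.rstrip() for line in change.source.splitlines())


def no_leftover_todo(change: Change) -> bool:
    return "TODO" not in change.source


CHECKS: list[Callable[[Change], bool]] = [no_trailing_whitespace, no_leftover_todo]


def triage(change: Change) -> str:
    """Return 'fix', 'human', or 'merge' for a proposed change."""
    failed = [check.__name__ for check in CHECKS if not check(change)]
    if failed:
        # Mechanical failures go back to the author or agent, not to a reviewer.
        return "fix"
    if change.touches_trust_boundary:
        # Only context-dependent questions reach a person.
        return "human"
    return "merge"
```

The ordering is the point of the sketch: a person is never asked to look at something a predicate could have rejected first.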

2. Rote production

The Compute Allocator framing makes production residue secondary. The human does not need to hand-author every plan, draft, prototype, or interface. The model can produce the scaffolding; the human judges whether the scaffolding created clarity.

The failure mode is nostalgia for production as proof of contribution. In this regime, contribution is often deciding what production to ask for.

3. Ceremonial approval

AI Brain Fry is the warning label on naive oversight. If one human must review too many agent outputs, review quality falls. The loop still has a human in it, but the human is no longer doing meaningful cognition.

That means "keep a human in the loop" is insufficient. The loop must be designed so the human sees:

  • fewer decisions;
  • higher-leverage decisions;
  • better summarized evidence;
  • explicit risk flags;
  • sampled or exception-based review for lower-stakes output.

Otherwise the loop only preserves the appearance of control.
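One way to picture sampled or exception-based review is a selector that always queues flagged or high-stakes outputs and spot-checks only a fraction of the rest. This is a minimal sketch under stated assumptions: the `AgentOutput` fields, the `select_for_review` name, and the default 10% sample rate are all hypothetical and not taken from any of the cited pages.

```python
import random
from dataclasses import dataclass, field


@dataclass
class AgentOutput:
    output_id: str
    high_stakes: bool = False
    risk_flags: list[str] = field(default_factory=list)


def select_for_review(outputs, sample_rate=0.1, seed=None):
    """Pick the outputs a human should actually look at."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    queue = []
    for out in outputs:
        if out.high_stakes or out.risk_flags:
            queue.append(out)   # exception-based: always reviewed
        elif rng.random() < sample_rate:
            queue.append(out)   # spot check on lower-stakes output
    return queue
```

With `sample_rate=0.0` the reviewer sees only exceptions; raising it trades reviewer attention for broader coverage of routine output.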

4. Accountability diffusion via employee framing

Humans also slow the system down when the AI-employee metaphor pushes work upward through unnecessary escalation. AI Employee Framing reports +44% escalation under employee framing among exposed managers. That is not safety; it is role confusion.

The fix is tool-mode orchestration: scoped workflows, decision rights, review gates, and named human owners. Do not ask, "Did the AI employee do a good job?" Ask, "Did this workflow produce an acceptable artifact under the checks and approvals we designed?"

A usable rule

Keep a human in the loop when the next decision changes direction, responsibility, or risk.

Remove the human from the loop when the next step only applies an already-agreed rule.

Redesign the loop when the human is reviewing more outputs than they can understand.
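The three rules above can be sketched as a small routing function. Everything named here is an assumption made for illustration: the `Step` fields, the capacity comparison used to stand in for "more outputs than they can understand", and the choice to check overload first and to default ambiguous steps to a human.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    HUMAN = "keep a human in the loop"
    AUTOMATE = "remove the human from the loop"
    REDESIGN = "redesign the loop"


@dataclass
class Step:
    changes_direction: bool
    changes_responsibility: bool
    changes_risk: bool
    applies_agreed_rule: bool
    pending_reviews: int    # outputs currently waiting on this reviewer
    review_capacity: int    # outputs the reviewer can genuinely understand


def route(step: Step) -> Route:
    # Assumption: overload is checked first, because an overloaded reviewer
    # cannot exercise judgment even on a step that deserves it.
    if step.pending_reviews > step.review_capacity:
        return Route.REDESIGN
    if step.changes_direction or step.changes_responsibility or step.changes_risk:
        return Route.HUMAN
    if step.applies_agreed_rule:
        return Route.AUTOMATE
    # Assumption: anything unclassified goes to a human rather than through.
    return Route.HUMAN
```

For example, `route(Step(False, False, False, True, 3, 10))` returns `Route.AUTOMATE`, while the same step with `pending_reviews=40` returns `Route.REDESIGN`.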

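These three rules reconcile the six pages cited above: Compute Allocator; Outsource Your Thinking, Not Your Understanding; Design Concept Grilling; Verification as the New Bottleneck; AI Employee Framing; and AI Brain Fry.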

Evidence base

About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.
