
Context Window Smart Zone

Published: May 6, 2026 · Filed: Concept · Reading: 5 min · Source: AI-synthesised

Smart zone vs dumb zone (Dex Horthy / Matt Pocock): quadratic attention scaling; ~100K marker independent of advertised context; clear-and-restart > compaction; status-line token counting as essential discipline



Summary

LLMs do not degrade linearly as context grows; they degrade quadratically, because the number of attention relationships scales O(n²) with token count. Matt Pocock (citing Dex Horthy of HumanLayer) frames this as a smart zone / dumb zone split: the first ~100K tokens of any session are the smart zone, where the model performs well; beyond that the model gets "dumber and dumber" regardless of advertised window size. Practical implication: context budget is a real, hard resource, and the agent harness is responsible for keeping individual sessions within the smart zone.

The constraint

"Every time you add a token to an LLM, it's kind of like you're adding a team to a football league. The number of matches goes up quadratically."

"It doesn't matter whether you're using 1 million context window or 200K, it's always going to be about [100K]. It starts to just get dumber."

Matt Pocock

The 1M-token context windows shipping in 2026 don't move the smart zone — they "just shipped a lot more dumb zone." Long context is useful for retrieval (find a fact in five copies of War and Peace) but not for reasoning (write code that depends on all of it).
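The football-league arithmetic from the quote is easy to make concrete. A toy sketch (illustrative only, not a model of any real attention implementation): with n tokens, attention relates every token to every other, so the number of pairwise relationships grows on the order of n²/2.

```python
# Toy illustration of quadratic attention scaling: adding tokens is like
# adding teams to a league -- the number of "matches" grows quadratically.

def pairwise_relations(n_tokens: int) -> int:
    """Number of unordered token pairs attention must relate."""
    return n_tokens * (n_tokens - 1) // 2

for n in (10_000, 100_000, 1_000_000):
    print(f"{n:>9,} tokens -> {pairwise_relations(n):>22,} pairs")
```

Going from 100K to 1M tokens is a 10x jump in context but roughly a 100x jump in pairwise relationships, which is the intuition behind "they just shipped a lot more dumb zone."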

Memento metaphor

Each session is a fresh start. There is no memory across sessions; the model resets to the system prompt every time. This is a constraint but also a feature — clearing context restores smart-zone behavior cheaply. Persistent state must live somewhere the next session can read it (repo, filesystem, Index.md-style index).

Compaction is worse than clearing

Claude Code's /compact command summarizes the running session into a smaller history. Pocock prefers /clear:

  • Compacted history accumulates "sediment" — distortions and lossy summaries — that degrades subsequent work
  • Clear-and-restart returns to a known-clean baseline (the system prompt)
  • The cost of clearing is paid back by working in the smart zone

The disagreement isn't universal — many developers like compaction because it preserves continuity. The right call depends on whether your task can be resumed cleanly from a written record (then prefer clear) or needs in-flight conversational context (then compaction wins).
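Clear-and-restart only works if the written record exists before the clear. A minimal sketch of that discipline, using a hypothetical `write_handoff` helper (not a Claude Code feature): persist what the next session needs to a file the fresh session can read, then `/clear`.

```python
# Hypothetical handoff helper for the clear-and-restart discipline:
# before clearing, persist a resumable record (here, HANDOFF.md) so the
# next session starts in the smart zone with a known-clean baseline.
from datetime import date
from pathlib import Path

def write_handoff(path: Path, done: list[str], next_steps: list[str]) -> None:
    """Write a short, readable record a fresh session can resume from."""
    lines = [f"# Handoff ({date.today().isoformat()})", "", "## Done"]
    lines += [f"- {item}" for item in done]
    lines += ["", "## Next"]
    lines += [f"- {item}" for item in next_steps]
    path.write_text("\n".join(lines) + "\n")

write_handoff(
    Path("HANDOFF.md"),
    done=["extracted token counter into status_line.py"],
    next_steps=["wire counter into the harness", "add dumb-zone warning"],
)
```

The file names and helper are illustrative; the point is that the record lives in the repo or filesystem, not in conversational history that compaction would lossily summarize.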

Implications for harness design

  1. System prompt budget. Anything always-in-context comes off the smart-zone budget. "I have seen people put 250K tokens [in the system prompt], then you're just going into the dumb zone before you can even do anything." Keep CLAUDE.md / AGENTS.md as a table of contents, not an encyclopedia (see Agent Harness Engineering on AGENTS.md as ToC).
  2. Sub-agents preserve parent context. A sub-agent runs in its own context window; only its summary returns. Pocock's grill-me skill ran a 93.7K-token sub-agent yet his main session still had ~25K tokens unused.
  3. Fragment work into many sessions. Loops (see Agent Loop Pattern) and vertical slices (see Vertical Slice Tracer Bullets) work because each iteration starts fresh in the smart zone.
  4. Reviewer should run in fresh context. If the implementer used 80K tokens in the smart zone, asking it to review its own work pushes the reviewer into the dumb zone. Cleared context = smart-zone reviewer (see Deep Modules for Agents on push-vs-pull and reviewer placement).
  5. Push vs pull instructions. Always-in-context instructions cost smart-zone tokens; pull-on-demand (skills) costs nothing until invoked.
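Point 1 can be sketched as simple budget accounting. The overhead numbers below are illustrative assumptions, not measurements; the structure is what matters: everything always in context is subtracted from the ~100K smart zone before any work begins.

```python
# Sketch of smart-zone budgeting for a harness. The overhead figures are
# made-up examples; SMART_ZONE is the empirical ~100K marker cited by Pocock.

SMART_ZONE = 100_000

def remaining_budget(always_in_context: dict[str, int]) -> int:
    """Tokens left in the smart zone after fixed, always-loaded overhead."""
    return SMART_ZONE - sum(always_in_context.values())

overhead = {
    "system prompt": 3_000,
    "AGENTS.md (table of contents, not encyclopedia)": 1_500,
    "tool definitions": 8_000,
}
print(f"smart-zone budget remaining: {remaining_budget(overhead):,} tokens")
```

A 250K-token system prompt makes `remaining_budget` negative before the first user message, which is the "dumb zone before you can even do anything" failure mode.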

Status-line token counter as an essential tool

Pocock recommends a status-line widget showing the exact running token count of each session — without it, developers don't know when they're approaching the dumb zone. He treats this as "absolutely essential information."
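A hypothetical sketch of such a widget (the real Claude Code status-line protocol is not specified here, and the chars-per-token ratio is a rough heuristic): estimate the running token count and warn as the session approaches the dumb zone.

```python
# Hypothetical status-line sketch: estimate a session's running token
# count and flag when it nears the ~100K smart-zone boundary.

SMART_ZONE = 100_000

def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for English text."""
    return len(text) // 4

def status_line(transcript: str) -> str:
    used = estimate_tokens(transcript)
    pct = 100 * used // SMART_ZONE
    warning = " !! approaching dumb zone" if pct >= 80 else ""
    return f"{used:,} tokens ({pct}% of smart zone){warning}"

print(status_line("word " * 50_000))  # a long transcript, ~250K characters
```

A real counter would read exact token usage from the harness rather than estimate; the essential part is that the number is always visible.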

Connections#

  • Matt Pocock — popularizer of the smart-zone framing
  • Agent Harness Engineering — system-prompt minimalism and AGENTS.md-as-ToC are restatements of the smart-zone principle
  • Agent Loop Pattern — fragmenting work to stay in smart zone is why loops are powerful
  • Vertical Slice Tracer Bullets — keeping each task small enough to fit in smart zone
  • Design Concept Grilling — the grilling session uses a sub-agent so the parent context stays small
  • Deep Modules for Agents — clearing-before-review is a smart-zone discipline
  • Harness Shrinkage as Models Improve — the smart zone may grow ("the dumb zone has become less dumb lately") but quadratic attention still constrains it
  • AI Brain Fry — human-side analog of the smart zone: oversight has its own degradation curve past capacity, mirroring attention degradation past ~100K tokens
  • Interaction Models — continuous audio/video at 200ms granularity accumulates context fast; TML names long-session context management as an open problem — the same constraint in a new modality

Open questions#

  • Does the smart-zone marker scale with model size, or is it bounded by attention architecture? Pocock observes "the dumb zone has become less dumb lately" but pegs it at 100K through 2026.
  • When sparse-attention or memory-augmented architectures ship, does the smart zone become a soft constraint?
  • How should harnesses surface remaining smart-zone budget to the user — token count, percentage, or a richer signal?


§ end
About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.

