
Context Window Smart Zone

Published: May 6, 2026 · Filed: Concept · Reading: 5 min · Source: AI-synthesised

Smart zone vs dumb zone (Dex Horthy / Matt Pocock): quadratic attention scaling; ~100K marker independent of advertised context; clear-and-restart > compaction; status-line token counting as essential discipline



Summary

LLMs do not degrade linearly as context grows; they degrade quadratically, because the number of attention relationships scales O(n²) with token count. Matt Pocock (citing Dex Horthy of HumanLayer) frames this as a smart zone / dumb zone split: the first ~100K tokens of any session are the smart zone, where the model performs well; beyond that the model gets "dumber and dumber" regardless of advertised window size. Practical implication: context budget is a real, hard resource, and the agent harness is responsible for keeping individual sessions within the smart zone.

The constraint

"Every time you add a token to an LLM, it's kind of like you're adding a team to a football league. The number of matches goes up quadratically."

"It doesn't matter whether you're using 1 million context window or 200K, it's always going to be about [100K]. It starts to just get dumber."

Matt Pocock

The 1M-token context windows shipping in 2026 don't move the smart zone — they "just shipped a lot more dumb zone." Long context is useful for retrieval (find a fact in five copies of War and Peace) but not for reasoning (write code that depends on all of it).
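The football-league arithmetic from the quote is easy to make concrete. A toy sketch (illustrative only, not a model of any real attention implementation): with n tokens, attention relates every token to every other, so the number of pairwise relationships grows on the order of n²/2.

```python
# Toy illustration of quadratic attention scaling: adding tokens is like
# adding teams to a league -- the number of "matches" grows quadratically.

def pairwise_relations(n_tokens: int) -> int:
    """Number of unordered token pairs attention must relate."""
    return n_tokens * (n_tokens - 1) // 2

for n in (10_000, 100_000, 1_000_000):
    print(f"{n:>9,} tokens -> {pairwise_relations(n):>22,} pairs")
```

Going from 100K to 1M tokens is a 10x jump in context but roughly a 100x jump in pairwise relationships, which is the intuition behind "they just shipped a lot more dumb zone."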

Memento metaphor

Each session is a fresh start. There is no memory across sessions; the model resets to the system prompt every time. This is a constraint but also a feature — clearing context restores smart-zone behavior cheaply. Persistent state must live somewhere the next session can read it (repo, filesystem, Index.md-style index).

Compaction is worse than clearing

Claude Code's /compact command summarizes the running session into a smaller history. Pocock prefers /clear:

  • Compacted history accumulates "sediment" — distortions and lossy summaries — that degrades subsequent work
  • Clear-and-restart returns to a known-clean baseline (the system prompt)
  • The cost of clearing is paid back by working in the smart zone

The disagreement isn't universal — many developers like compaction because it preserves continuity. The right call depends on whether your task can be resumed cleanly from a written record (then prefer clear) or needs in-flight conversational context (then compaction wins).
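Clear-and-restart only works if the written record exists before the clear. A minimal sketch of that discipline, using a hypothetical `write_handoff` helper (not a Claude Code feature): persist what the next session needs to a file the fresh session can read, then `/clear`.

```python
# Hypothetical handoff helper for the clear-and-restart discipline:
# before clearing, persist a resumable record (here, HANDOFF.md) so the
# next session starts in the smart zone with a known-clean baseline.
from datetime import date
from pathlib import Path

def write_handoff(path: Path, done: list[str], next_steps: list[str]) -> None:
    """Write a short, readable record a fresh session can resume from."""
    lines = [f"# Handoff ({date.today().isoformat()})", "", "## Done"]
    lines += [f"- {item}" for item in done]
    lines += ["", "## Next"]
    lines += [f"- {item}" for item in next_steps]
    path.write_text("\n".join(lines) + "\n")

write_handoff(
    Path("HANDOFF.md"),
    done=["extracted token counter into status_line.py"],
    next_steps=["wire counter into the harness", "add dumb-zone warning"],
)
```

The file names and helper are illustrative; the point is that the record lives in the repo or filesystem, not in conversational history that compaction would lossily summarize.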

Implications for harness design

  1. System prompt budget. Anything always-in-context comes off the smart-zone budget. "I have seen people put 250K tokens [in the system prompt], then you're just going into the dumb zone before you can even do anything." Keep CLAUDE.md / AGENTS.md as a table of contents, not an encyclopedia (see Agent Harness Engineering on AGENTS.md as ToC).
  2. Sub-agents preserve parent context. A sub-agent runs in its own context window; only its summary returns. Pocock's grill-me skill ran a 93.7K-token sub-agent yet his main session still had ~25K tokens unused.
  3. Fragment work into many sessions. Loops (see Agent Loop Pattern) and vertical slices (see Vertical Slice Tracer Bullets) work because each iteration starts fresh in the smart zone.
  4. Reviewer should run in fresh context. If the implementer used 80K tokens in the smart zone, asking it to review its own work pushes the reviewer into the dumb zone. Cleared context = smart-zone reviewer (see Deep Modules for Agents on push-vs-pull and reviewer placement).
  5. Push vs pull instructions. Always-in-context instructions cost smart-zone tokens; pull-on-demand (skills) costs nothing until invoked.
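Point 1 can be sketched as simple budget accounting. The overhead numbers below are illustrative assumptions, not measurements; the structure is what matters: everything always in context is subtracted from the ~100K smart zone before any work begins.

```python
# Sketch of smart-zone budgeting for a harness. The overhead figures are
# made-up examples; SMART_ZONE is the empirical ~100K marker cited by Pocock.

SMART_ZONE = 100_000

def remaining_budget(always_in_context: dict[str, int]) -> int:
    """Tokens left in the smart zone after fixed, always-loaded overhead."""
    return SMART_ZONE - sum(always_in_context.values())

overhead = {
    "system prompt": 3_000,
    "AGENTS.md (table of contents, not encyclopedia)": 1_500,
    "tool definitions": 8_000,
}
print(f"smart-zone budget remaining: {remaining_budget(overhead):,} tokens")
```

A 250K-token system prompt makes `remaining_budget` negative before the first user message, which is the "dumb zone before you can even do anything" failure mode.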

Status-line token counter as an essential tool

Pocock recommends a status-line widget showing the exact running token count of each session — without it, developers don't know when they're approaching the dumb zone. He treats this as "absolutely essential information."
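A hypothetical sketch of such a widget (the real Claude Code status-line protocol is not specified here, and the chars-per-token ratio is a rough heuristic): estimate the running token count and warn as the session approaches the dumb zone.

```python
# Hypothetical status-line sketch: estimate a session's running token
# count and flag when it nears the ~100K smart-zone boundary.

SMART_ZONE = 100_000

def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for English text."""
    return len(text) // 4

def status_line(transcript: str) -> str:
    used = estimate_tokens(transcript)
    pct = 100 * used // SMART_ZONE
    warning = " !! approaching dumb zone" if pct >= 80 else ""
    return f"{used:,} tokens ({pct}% of smart zone){warning}"

print(status_line("word " * 50_000))  # a long transcript, ~250K characters
```

A real counter would read exact token usage from the harness rather than estimate; the essential part is that the number is always visible.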

Connections#

  • Matt Pocock — popularizer of the smart-zone framing
  • Agent Harness Engineering — system-prompt minimalism and AGENTS.md-as-ToC are restatements of the smart-zone principle
  • Agent Loop Pattern — fragmenting work to stay in smart zone is why loops are powerful
  • Vertical Slice Tracer Bullets — keeping each task small enough to fit in smart zone
  • Design Concept Grilling — the grilling session uses a sub-agent so the parent context stays small
  • Deep Modules for Agents — clearing-before-review is a smart-zone discipline
  • Harness Shrinkage as Models Improve — the smart zone may grow ("the dumb zone has become less dumb lately") but quadratic attention still constrains it
  • AI Brain Fry — human-side analog of the smart zone: oversight has its own degradation curve past capacity, mirroring attention degradation past ~100K tokens
  • Interaction Models — continuous audio/video at 200ms granularity accumulates context fast; TML names long-session context management as an open problem — the same constraint in a new modality

Open questions#

  • Does the smart-zone marker scale with model size, or is it bounded by attention architecture? Pocock observes "the dumb zone has become less dumb lately" but pegs it at 100K through 2026.
  • When sparse-attention or memory-augmented architectures ship, does the smart zone become a soft constraint?
  • How should harnesses surface remaining smart-zone budget to the user — token count, percentage, or a richer signal?


§ end
About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.

