Sources#
Summary#
John Ousterhout's A Philosophy of Software Design distinguishes deep modules (a small interface hiding a lot of behavior) from shallow modules (an interface that is large relative to the little behavior behind it, which in practice means many small modules). Matt Pocock applies the distinction to agent-friendly codebases: agents thrive in deep-module codebases because the test boundary is clear, the dependency graph has few hops, and the developer can delegate implementation while keeping only the interface in mind. Shallow-module sprawl is what AI produces by default if you don't push back, and it leads to unreviewable, untestable codebases.
Deep vs shallow#
- Shallow module: many small files, each exposing many small functions, with a dense dependency graph between them. Test boundaries are unclear: do you mock every neighbor, or test units in isolation and miss integration bugs? The AI can't see the whole picture, so it can't tell where the abstractions are.
- Deep module: one larger module with a small public interface and lots of internal logic. One natural test boundary: the public interface. AI can see what the module does without traversing dependencies. Implementation is delegable because the interface is the contract.
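A minimal sketch of what "small public interface, lots of internal logic" can look like in code. The gamification rules, the `award_points` function, and the `Award` type are all hypothetical, invented here for illustration; only the shape (one public entry point, private helpers behind it) reflects the distinction above.

```python
"""A deep module: one public entry point, several private helpers."""
from dataclasses import dataclass

__all__ = ["award_points", "Award"]  # the whole public interface


@dataclass(frozen=True)
class Award:
    points: int
    streak_bonus: int
    level: int


def award_points(lesson_minutes: int, streak_days: int, total_before: int) -> Award:
    """Public interface: everything a caller (or a test) needs to know."""
    base = _base_points(lesson_minutes)
    bonus = _streak_bonus(base, streak_days)
    level = _level_for(total_before + base + bonus)
    return Award(points=base, streak_bonus=bonus, level=level)


# Internal logic below (hypothetical rules). Free to change without touching callers.
def _base_points(lesson_minutes: int) -> int:
    return max(0, lesson_minutes) * 10


def _streak_bonus(base: int, streak_days: int) -> int:
    # 5% of base per streak day, capped at 10 days (50%)
    return base * min(streak_days, 10) * 5 // 100


def _level_for(total_points: int) -> int:
    return 1 + total_points // 1000
```

A test only needs to call `award_points`; the three helpers can be rewritten, merged, or split without any test or caller changing. The shallow equivalent would put each helper in its own file with its own exports and its own mocks.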
Why agents specifically benefit#
- Fewer dependency hops to traverse. Smart-zone budget (see Context Window Smart Zone) is conserved.
- Test boundaries are obvious. A reviewer agent can verify behavior at the interface without micro-managing internals.
- Implementation is delegable. Pocock's "gray box" pattern — design the interface yourself, hand off implementation to the agent. You retain a model of what without burning attention on how.
- The module map is finite. A PRD that says "modify gamification service, dashboard route, lesson route" is concrete; the same PRD against a shallow codebase would say "modify 47 files."
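One way to picture the "gray box" hand-off, as a sketch under stated assumptions: the human writes the interface and an interface-level check, and the agent supplies whatever implementation passes it. `ProgressStore`, `check_contract`, and `InMemoryProgressStore` are hypothetical names, not taken from Pocock's material.

```python
from typing import Protocol


class ProgressStore(Protocol):
    """Interface designed by the human; this is the contract."""

    def record(self, user_id: str, lesson_id: str) -> None: ...
    def completed(self, user_id: str) -> list[str]: ...


def check_contract(store: ProgressStore) -> None:
    """Interface-level checks any delegated implementation must pass."""
    store.record("u1", "intro")
    store.record("u1", "intro")  # recording twice must not duplicate
    store.record("u1", "generics")
    assert store.completed("u1") == ["intro", "generics"]
    assert store.completed("nobody") == []


class InMemoryProgressStore:
    """Stand-in for what the agent would write; internals are its business."""

    def __init__(self) -> None:
        self._by_user: dict[str, dict[str, None]] = {}

    def record(self, user_id: str, lesson_id: str) -> None:
        # A dict keeps insertion order and ignores repeated keys.
        self._by_user.setdefault(user_id, {})[lesson_id] = None

    def completed(self, user_id: str) -> list[str]:
        return list(self._by_user.get(user_id, {}))
```

Running `check_contract(InMemoryProgressStore())` verifies the behavior without reading the implementation, which is the same position a reviewer agent is in when it checks a diff at the interface.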
The risk: agents drift toward shallow#
Pocock's observation:
"If you don't watch AI carefully, it's going to produce a code base that looks like [shallow]. So you need to be really, really careful when you're directing it."
Reasons agents drift shallow:
- Each task is small; the agent makes the smallest change that works
- Without a global module map, the agent doesn't know which existing module to extend, so it makes a new one
- "Single-responsibility principle" misapplied — the agent wraps every helper in its own file
The fix is twofold:
- Keep the module map in the PRD (see Design Concept Grilling) so the agent knows what to extend.
- Periodically run a refactor pass that consolidates shallow modules into deep ones. Pocock has a skill for this, `improve-code-base-architecture`, which scans the codebase for "architectural improvement candidates" (clusters of related modules that could be deepened) and gives arguments and a dependency category for each.
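To make "scanning for candidates" concrete, here is a deliberately crude heuristic. It is not how Pocock's skill works (the source does not describe its internals); it is an assumed stand-in that scores a Python module by lines of code per public top-level name, so that tiny modules exposing one trivial function score low. The function names and the threshold are arbitrary.

```python
import ast


def depth_score(source: str) -> float:
    """Lines of code per public top-level name; lower suggests shallower."""
    tree = ast.parse(source)
    public = [
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        and not node.name.startswith("_")
    ]
    code_lines = [
        line
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]
    return len(code_lines) / max(1, len(public))


def shallow_candidates(modules: dict[str, str], threshold: float = 5.0) -> list[str]:
    """Names of modules whose score falls below the (arbitrary) threshold."""
    return sorted(name for name, src in modules.items() if depth_score(src) < threshold)
```

A score like this only flags individual files. Deciding which flagged files belong together in one deeper module still requires reading them, which is the part the skill delegates to the agent.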
Push vs pull instructions#
A subtle but high-leverage architectural choice: how to deliver coding standards and architectural rules to agents.
| Mode | Mechanism | When |
|---|---|---|
| Push | Always-in-context (CLAUDE.md, system prompt) | Reviewer agents — they need to know the standards to compare against the code |
| Pull | On-demand via skill (agent fetches when relevant) | Implementer agents — pulling avoids burning smart-zone budget on rules that don't apply |
This is also why a fresh-context reviewer outperforms a same-context reviewer: the implementer can pull rules as needed, while the reviewer gets them pushed and has a clean smart-zone window in which to actually evaluate the code.
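The push/pull split can be sketched as two ways of building a prompt. Everything here is hypothetical (the rule texts, the function names, the lookup standing in for a skill); it only illustrates that the reviewer's context contains every rule up front while the implementer's contains just an index.

```python
CODING_STANDARDS = {
    "naming": "Use descriptive names; no abbreviations in public APIs.",
    "errors": "Return typed errors from module boundaries; never swallow exceptions.",
    "testing": "Test through the public interface of a module.",
}


def reviewer_prompt(diff: str) -> str:
    """Push: every standard is in context before the review starts."""
    rules = "\n".join(f"- {text}" for text in CODING_STANDARDS.values())
    return f"Coding standards:\n{rules}\n\nReview this diff:\n{diff}"


def implementer_prompt(task: str) -> str:
    """Pull: only the index of available rules is in context."""
    topics = ", ".join(sorted(CODING_STANDARDS))
    return f"Task: {task}\nStandards available on request: {topics}"


def pull_standard(topic: str) -> str:
    """Stand-in for a skill lookup the implementer triggers when relevant."""
    return CODING_STANDARDS.get(topic, f"No standard named {topic!r}.")
```

The implementer pays for a rule only when it calls `pull_standard`; the reviewer pays for all of them once, in a context that holds little else.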
Reviewer in fresh context#
If implementation used 80K tokens of smart zone, a same-context reviewer is reading the diff in the dumb zone. Clearing the context and running review fresh restores smart-zone reasoning. Pocock pairs this with model selection: Sonnet for implementation, Opus for review — "I need the smarts then."
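The arithmetic behind this, as an illustration only. The 80K figure comes from the text; the smart-zone threshold, the diff size, and the size of the pushed standards are assumed numbers chosen to make the comparison visible.

```python
SMART_ZONE_TOKENS = 100_000  # assumed threshold, for illustration only


def in_smart_zone(*token_costs: int) -> bool:
    """True if the combined context stays under the assumed threshold."""
    return sum(token_costs) < SMART_ZONE_TOKENS


implementation = 80_000  # figure from the text
diff = 25_000            # assumed size of the diff under review
standards = 5_000        # assumed size of pushed coding standards

# Reviewing on top of the implementation history: 110,000 tokens.
same_context = in_smart_zone(implementation, diff, standards)

# Reviewing after clearing the context: 30,000 tokens.
fresh_context = in_smart_zone(diff, standards)
```

Under these assumed numbers `same_context` is `False` and `fresh_context` is `True`: the review itself costs the same, and what changes is how much history it sits on top of.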
The Sandcastle three-agent pattern#
Pocock's parallelization library bakes deep-module discipline into its architecture:
- Planner — picks N parallel issues from the backlog
- N implementers — one per issue, each in its own git worktree + Docker sandbox; coding standards available via pull skills
- Reviewer — runs in fresh context per implementer's diff; coding standards pushed into its system prompt
- Merger — reconciles all approved branches, fixes type / test conflicts
Each agent runs in its own smart zone. Each module-level change is reviewed at the interface, not the implementation.
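The flow of the roles listed above can be sketched in a few lines. This is not Sandcastle's API, which the source does not show. `Agent`, `run_round`, and `merge` are hypothetical, the planner is reduced to "take the first n issues", and worktrees and Docker sandboxes are omitted. The point is only that the reviewer call receives the pushed standards and the diff, never the implementer's conversation.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable

Agent = Callable[[str], str]  # prompt in, text out; each call is a fresh context


@dataclass
class Result:
    issue: str
    diff: str
    approved: bool


def run_round(backlog: list[str], n: int, implement: Agent, review: Agent,
              standards: str) -> list[Result]:
    issues = backlog[:n]  # planner stand-in: take the first n issues

    def work(issue: str) -> Result:
        diff = implement(f"Implement: {issue}")
        # The reviewer sees only the pushed standards and the diff.
        verdict = review(f"{standards}\n\nDiff:\n{diff}")
        approved = verdict.strip().upper().startswith("APPROVE")
        return Result(issue, diff, approved)

    # One worker per issue; pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=max(1, n)) as pool:
        return list(pool.map(work, issues))


def merge(results: list[Result]) -> list[str]:
    """Merger stand-in: collect approved diffs in backlog order."""
    return [r.diff for r in results if r.approved]
```

Passing plain callables for `implement` and `review` keeps the sketch testable with fakes; a real setup would wrap model calls and run each implementer in its own worktree.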
Why "large model = no design needed" is wrong#
The seductive argument: "models are smart enough now to navigate any codebase, design doesn't matter." Pocock's counter:
"Bad code bases make bad agents. If you have a garbage code base, you're going to get garbage out of the agent that's working in that code base."
The smart-zone constraint (see Context Window Smart Zone) is structural, not just a function of model size. Architecture choices that make the agent's job harder cost smart-zone budget that better architecture would have saved.
Connections#
- Matt Pocock — primary articulator
- Context Window Smart Zone — deep modules conserve smart-zone budget
- Vertical Slice Tracer Bullets — slices cut through deep modules at the interface, exercising the natural test boundary
- Design Concept Grilling — module map in the PRD operationalizes deep-module discipline at planning time
- Agent Loop Pattern — review-in-fresh-context fits the loop's clear-then-restart rhythm
- Agent Harness Engineering — "enforce invariants, not implementations" is the same principle at the orchestration layer
- Claude Code Best Practices — module map in CLAUDE.md sits in the same family
Open questions#
- How big is "deep enough"? Pocock's example modules are several hundred LOC; Ousterhout's textbook examples are larger. There is presumably a sweet spot, but it is not articulated in the sources.
- For ports/adapters codebases, does the deep-module advice transfer cleanly? The "small interface" is the port; the "large behavior" is the adapter. Probably yes, but the source does not exercise this case.
- Refactor cost vs benefit: when is "improve-code-base-architecture" worth running on a working repo?
