Summary#
Fiona Fung's central claim from running Claude Code + Cowork engineering: for years, engineering bandwidth was the expensive resource — planning, reviews, and process all existed to protect it. Once agentic coding made coding cheap, the bottleneck moved to verification, review, and maintenance. "On the Claude Code team, coding is really not the slow part anymore." The new scarce resource is confidence that the change is correct — and it gets scarcer precisely because bandwidth (and therefore throughput) exploded.
Why verification is now the constraint#
Three forces converge:
- Volume. Bandwidth increased so much that "we have to pay even more attention to: is it correct."
- Blurring roles. More people (designers, managers, PMs) now check in changes, so everyone needs confidence their change is correct.
- Maintenance cost. Higher throughput means more to maintain — the cost of maintenance becomes a first-class concern, not an afterthought.
This is the org-level mirror of Karpathy's The Verifiability Thesis ("LLMs automate what you can verify") and the demand side of Harness Shrinkage as Models Improve (prompt scaffolding shrinks; mechanical verification stays load-bearing).
TDD loses its tax#
A vivid sign of the shift: TDD used to feel like "eating broccoli" (write the failing test first, verify it fails, then fix). With Claude, Fung found it "so much more fun and pleasurable… it took the tax out of test-driven development." The economics flipped: when writing the test is nearly free, the discipline that grounds verification (a test that provably fails, then passes) is pure upside. (Cf. the TDD / red-green-refactor discipline; the failing-test-first step is the verifier.)
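The red-green step can be sketched in a few lines of Python. This is a minimal illustration, not something taken from Fung's talk: `parse_duration_buggy`, `parse_duration_fixed`, `check_parse_duration`, and `passes` are all hypothetical names. The point it tries to show is that the test is run against the unfixed code first, so its failure is observed rather than assumed.

```python
def parse_duration_buggy(text: str) -> int:
    """Hypothetical pre-fix implementation: drops the minutes component."""
    hours, _minutes = text.split(":")
    return int(hours) * 60


def parse_duration_fixed(text: str) -> int:
    """Hypothetical post-fix implementation: returns total minutes."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)


def check_parse_duration(parse) -> None:
    """The verifier: written before the fix and independent of either version."""
    assert parse("1:30") == 90
    assert parse("0:05") == 5


def passes(check, impl) -> bool:
    """Run a check against an implementation and report whether it passed."""
    try:
        check(impl)
        return True
    except AssertionError:
        return False


# Red: the check must fail against the current code, or it verifies nothing.
red = passes(check_parse_duration, parse_duration_buggy)
# Green: the same unchanged check passes once the fix lands.
green = passes(check_parse_duration, parse_duration_fixed)
print(f"red passed={red}, green passed={green}")
```

In a real project the check would live in a test runner and the "buggy" version would simply be the code before the change; the two are kept side by side here only so the sketch is self-contained.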
Shift left#
Her recurring phrase: shift left — catch problems closer to the source via automation, not after a customer hits them. "What's better than me running into the bug first? Having automation in place to catch it closer to the source." As throughput rises, the only way verification keeps up is by being automated and early rather than manual and late.
Who reviews — and the human-in-the-loop line#
Before shipping Claude Code's own code-review feature, "how do you keep up with code reviews?" was her most-asked question. The answer: Claude Code review handles style, lint, obvious bugs, and spec-drift (if you check the spec into the codebase, "Claude is very good about verifying against spec drift"). But humans stay in the loop where it matters: legal review, risk tolerance, trust boundaries — "trust but verify, and where humans bring needed expertise." The division of labor: automate the mechanical verification, reserve human judgment for risk and trust-boundary calls. (Cf. Deep Modules for Agents: reviewer in a fresh context.)
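One way to picture that division of labor is a routing rule: every change gets the automated pass, and a human is added only when the change touches a risk or trust-boundary area. The sketch below is an assumption for illustration, not a description of how the Claude Code team routes reviews; the path prefixes and the `route_review` function are hypothetical.

```python
# Hypothetical path prefixes that mark trust boundaries or legal exposure.
HUMAN_REVIEW_PREFIXES = ("auth/", "billing/", "legal/", "permissions/")


def needs_human_review(changed_paths: list[str]) -> bool:
    """True if any changed file sits under a sensitive prefix."""
    return any(path.startswith(HUMAN_REVIEW_PREFIXES) for path in changed_paths)


def route_review(changed_paths: list[str]) -> list[str]:
    """Return the reviewers a change should get, automated pass first."""
    reviewers = ["automated"]  # style, lint, obvious bugs, spec drift
    if needs_human_review(changed_paths):
        reviewers.append("human")  # risk tolerance and trust-boundary calls
    return reviewers


print(route_review(["docs/readme.md"]))
print(route_review(["ui/button.tsx", "auth/session.ts"]))
```

A path-based rule is only the simplest possible stand-in. What counts as a trust boundary is itself a human judgment, which is why the prefix list is data rather than logic.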
Measuring the shift (and a trap)#
Signals she watches: onboarding ramp-up time ↓, PR cycle time ↓, Claude-assisted commits ↑ ("I haven't seen a commit that wasn't Claude-assisted in months"). The trap: don't read end-to-end PR cycle time alone; break it into funnel chunks. If cycle time isn't dropping, the cause may not be low AI adoption; it could be CI/build systems jamming under the new throughput. And throughput isn't the goal: "find some way to measure whatever you're actually trying to solve," not just velocity.
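Breaking cycle time into funnel chunks can be sketched as follows. The stage names, the timestamp fields (`opened`, `ci_done`, `first_review`, `approved`, `merged`), and the sample data are all assumptions made up for illustration; a real pipeline would pull its own events from the code host and CI system.

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical stage boundaries, in order: (stage name, start event, end event).
STAGES = [
    ("ci", "opened", "ci_done"),
    ("wait_for_review", "ci_done", "first_review"),
    ("review", "first_review", "approved"),
    ("merge", "approved", "merged"),
]


def stage_hours(pr: dict[str, datetime]) -> dict[str, float]:
    """Split one PR's end-to-end time into per-stage durations in hours."""
    return {
        name: (pr[end] - pr[start]).total_seconds() / 3600
        for name, start, end in STAGES
    }


def funnel_medians(prs: list[dict[str, datetime]]) -> dict[str, float]:
    """Median hours spent in each stage across all PRs."""
    per_pr = [stage_hours(pr) for pr in prs]
    return {name: median(p[name] for p in per_pr) for name, _, _ in STAGES}


def at(hours: float) -> datetime:
    """Helper for the made-up sample data: a time offset from a fixed start."""
    return datetime(2025, 1, 1) + timedelta(hours=hours)


# Illustrative data only: three PRs where CI takes most of the elapsed time.
events = ("opened", "ci_done", "first_review", "approved", "merged")
sample_prs = [
    dict(zip(events, map(at, offsets)))
    for offsets in [(0, 3, 4, 5, 5.5), (0, 5, 5.5, 6.5, 7), (0, 4, 5, 5.5, 6)]
]

medians = funnel_medians(sample_prs)
slowest = max(medians, key=medians.get)
print(medians, "slowest stage:", slowest)
```

With this sample the end-to-end number alone would say little, while the per-stage view points at CI rather than at review or adoption.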
Connections#
- Fiona Fung — author of the thesis
- The Verifiability Thesis — Karpathy's "automate what you can verify" is the model-level cause; this is the org-level consequence
- Harness Shrinkage as Models Improve — the synthesis it confirms: scaffolding shrinks, mechanical verification doesn't
- Evals as Product Spec — Cat Wu's evals are verification encoded as product spec; the PM-side companion
- Code as Source of Truth — checking the spec into the repo is what lets Claude verify spec drift
- Building Is Cheap, Arguing Is Expensive — the upstream half: generation is cheap, so verification (and judgment) is where cost concentrates
- Claude Code Auto Mode — the auto-approve classifier is verification automation at the permission layer
- Deep Modules for Agents — reviewer-in-fresh-context is the verification-quality move at the code-review layer
- AI Brain Fry — the risk if verification stays manual: oversight fatigue increases errors as volume grows
- AI-Driven Formal Proof Search — the extreme case: a compiler as the verifier, so the bottleneck is fully mechanized
Open Questions#
- Fung's own open question: "How far do you push fully automated reviews?" — where's the speed/safety balance, and how do you keep humans confident without re-introducing the review bottleneck?
- If CI/build is the hidden jam, does verification infrastructure (test runners, CI capacity) become the actual capex of an AI-native org?
Sources#
Cited by 13
- Anthropic
AI safety company / vendor of Claude; mission-as-tiebreaker culture; ~30–40 PMs across teams; Mike Krieger leads Labs r…
- Building Is Cheap, Arguing Is Expensive
"In technical debate, code wins": generate three PRs vs whiteboard; prototype over design doc; reduce design docs
- Cat Wu
Head of Product for Claude Code and Cowork at Anthropic; primary articulator of AI-native product cadence and engineer-…
- Claude Code
Anthropic's agentic coding product; created by Boris Cherny late 2024; TypeScript/React; CLI/desktop/web/mobile/IDE sur…
- Code as Source of Truth
Docs go stale at high coding throughput; check specs/skills into the repo; onboard via Claude; spec-drift verification
- Deep Modules for Agents
Ousterhout deep-vs-shallow modules applied to agent-friendly codebases; push-vs-pull instruction delivery; reviewer in…
- Dogfooding as Product Discipline
Product sense is built by relentless first-hand use ("ant food"); Mr. Peanut catch; cross-source (Cat Wu vibe-checks, G…
- Evals as Product Spec
Cat Wu's framing of evals as the emerging core PM skill: ten great evals beats a hundred mediocre; encode what done loo…
- Fiona Fung
Leads engineering + product for Claude Code and Cowork at Anthropic (ex-Meta/Microsoft); "what served you prior may no…
- Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
- Product Velocity as Moat
Shipping speed as differentiator + trust signal ("you'll scale with us"); a treadmill that must convert into durable lo…
- The Verifiability Thesis
LLMs automate what you can *verify* as computers automate what you can *specify*; RL verification rewards → jagged peak…
- Vibe Coding vs. Agentic Engineering
Vibe coding raises the floor (anyone builds); agentic engineering preserves the quality bar while going faster; ">10x a…
