
Build for the Next Model

Published: June 7, 2026 · Filed: Concept · Domain: AI Engineering · Tags: AI Coding Workflow, Product Strategy, Model Improvement · Reading: 6 min · Source: AI-synthesised

Prototype the thing that almost works, not the thing that already works. Bet that the next concrete model release (not a far-future AGI) fixes what your engineering can't. Claude Design's Opus 4.7 payoff is the cleanest case.


Summary

This is a product-strategy corollary of Harness Shrinkage as Models Improve, now stated independently by three Anthropic voices: don't build the thing that already works; prototype the thing that almost works, and bet that the next model release closes the gap. Dan Carey gives the cleanest case: Claude Design shipped with a list of problems the team "did not fix with clever engineering… we fixed them with Opus 4.7 coming out." Boris Cherny built Claude Code knowing "it wouldn't have PMF for 6 months because we were building for the next model." Cat Wu frames the discipline as "build products that don't necessarily work yet so that you know what is missing… and then with the newest model you can just swap it in." Because models improve rapidly, engineering effort spent forcing today's model to do what the next release will do for free is wasted: "the model releases are a tide that lifts all boats."
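One way to make "swap it in" cheap is to keep the list of known gaps as an executable suite and treat the model as an injected dependency. The sketch below is a minimal illustration under stated assumptions, not anything the quoted teams describe: `GapCase`, `run_gap_suite`, `newly_closed`, and the two toy model functions are hypothetical names, and a "model" is reduced to a plain function from prompt to text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption for illustration: a "model" is any function from prompt to text.
# A real client call would sit behind this signature.
Model = Callable[[str], str]


@dataclass(frozen=True)
class GapCase:
    """One known gap: a prompt plus a check for an acceptable output."""
    name: str
    prompt: str
    passes: Callable[[str], bool]


def run_gap_suite(model: Model, cases: List[GapCase]) -> Dict[str, bool]:
    """Run every known-gap case against one model and record pass/fail."""
    return {case.name: case.passes(model(case.prompt)) for case in cases}


def newly_closed(before: Dict[str, bool], after: Dict[str, bool]) -> List[str]:
    """Names of gaps that failed on the old model and pass on the new one."""
    return sorted(
        name for name, ok in after.items() if ok and not before.get(name, False)
    )


# Toy stand-ins for two model releases (hypothetical behaviour).
def current_model(prompt: str) -> str:
    return "TODO"


def next_model(prompt: str) -> str:
    return "<svg>...</svg>" if "logo" in prompt else "TODO"


CASES = [
    GapCase("renders_logo", "draw a logo", lambda out: out.startswith("<svg")),
    GapCase("renders_chart", "draw a chart", lambda out: out.startswith("<svg")),
]

before = run_gap_suite(current_model, CASES)
after = run_gap_suite(next_model, CASES)
print(newly_closed(before, after))  # ['renders_logo']
```

The design choice is that the suite records what is missing rather than what works, so a new release is evaluated by changing one argument and reading off which gaps closed.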

The Carey statement (and why it's the clearest)

"You do not want to work on the thing that already works. You often want to prototype the thing that almost works… The next model may just fix the issues that you cannot solve via engineering. We had this with Claude Design… We fixed them with Opus 4.7 coming out."

This is a rare retrospective, concrete confirmation of the bet: a named product (Claude Design), a named model (Claude Opus 4.7), and a specific outcome (unsolved prototype gaps closed by the release rather than by engineering). Cherny and Wu state the strategy prospectively; Carey shows it paying off.

The crucial calibration: next model, not a strawman AGI

The bet is easy to misread as "build for some imagined super-AI." Cat Wu guards against exactly that. Her stated stance is "build for the current model": "It's very easy to build the product for the super-AGI strong model. The hard thing is figuring out for the current model, how do you elicit the maximum capability?" Her warning and the next-model bet reconcile into one rule:

  • Don't build for today's model only → you undershoot, and ship something that is obsolete the moment the next release lands.
  • Don't build for a far-future AGI strawman → you overshoot, and ship vaporware that depends on capability nobody has.
  • Build for the next concrete release (roughly the model 6 months out) → you prototype "the thing that almost works," ship it as a research preview, and let the next release, which you can reasonably forecast, close the gap.

Carey names the target the prototype is reaching for: not completeness but "that hint of magic… something that could become [complete] in the future."

Why this follows from the bitter lesson

This is the product-side expression of The Bitter Lesson and Harness Shrinkage as Models Improve: capability migrates into the model over releases, so scaffolding built to compensate for a current limitation is a depreciating asset. If a gap is the kind that scales away (reasoning, instruction-following, multimodal fidelity), patching it with engineering is building a crutch you'll soon delete. The discipline is to separate "wait for the model" gaps from durable harness work. Harness Shrinkage as Models Improve supplies the caveat: mechanical verification, security, and brand/character don't migrate inward.
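The triage described above can be sketched as a small lookup. The two category sets come straight from the paragraph; everything else (the names `Gap`, `GapKind`, `classify`, `engineering_backlog`, and the example gaps) is hypothetical, and a real triage is a judgment call that a table like this only records.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class GapKind(Enum):
    SCALES_AWAY = "wait for the model"
    DURABLE = "durable harness work"


# Areas named in the text; treating them as a fixed lookup is an assumption.
SCALES_AWAY_AREAS = {"reasoning", "instruction-following", "multimodal fidelity"}
DURABLE_AREAS = {"mechanical verification", "security", "brand/character"}


@dataclass(frozen=True)
class Gap:
    description: str
    area: str


def classify(gap: Gap) -> GapKind:
    """Map a gap to its kind; unknown areas are refused rather than guessed."""
    if gap.area in SCALES_AWAY_AREAS:
        return GapKind.SCALES_AWAY
    if gap.area in DURABLE_AREAS:
        return GapKind.DURABLE
    raise ValueError(f"unclassified area: {gap.area!r}; decide by hand")


def engineering_backlog(gaps: List[Gap]) -> List[str]:
    """Only durable gaps earn engineering time; the rest wait for a release."""
    return [g.description for g in gaps if classify(g) is GapKind.DURABLE]


# Hypothetical gaps for a prototype.
gaps = [
    Gap("ignores the 'two columns only' instruction", "instruction-following"),
    Gap("generated code is never type-checked", "mechanical verification"),
    Gap("output drifts from the house visual style", "brand/character"),
]

print(engineering_backlog(gaps))
# ['generated code is never type-checked', 'output drifts from the house visual style']
```

Raising on an unknown area is deliberate: a gap that fits neither list is exactly the case where guessing produces either vaporware or a crutch.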

The tension to hold

"Prototype the thing that almost works" is in direct tension with the prototype-as-evidence trap from Problem-Solution Fit Discipline: a fast prototype proves the build was tractable, not that the problem is real. The reconciliation: build-for-the-next-model addresses capability risk (will the tech get there? yes, so wait for it), not market risk (does anyone want this? the prototype doesn't answer that). You still validate demand through users; you just don't burn engineering forcing a capability the next model will hand you. Carey's own safeguard is that the bet rides on top of Compounding Loop Optimization and daily user contact: the "shape of the product" is validated continuously even while specific capability gaps are left for the model to close.

Open Questions

  • How do you tell a "wait for the model" gap from a durable-harness gap before the next release? Get it wrong and you either ship vaporware or build a crutch you'll delete.
  • The bet depends on a reliable release cadence and a forecastable capability curve (Task Time-Horizon Scaling). What happens to "build for the next model" if model improvement stalls (the stalled-but-diffused future)?
  • Does the strategy generalize outside frontier labs, who have privileged visibility into the next model? An external team is betting on a release it can't see.

About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.
