## Question
What's the best way to learn and co-work with AI models and services today — especially for a software engineer whose work may no longer center on heavy coding in the new AI era? Create guidelines for individuals to learn and develop the skills these changes demand.
## TL;DR
Coding skill is becoming the baseline, not the differentiator. The job is migrating from writing code to deciding what to build, designing the environment the agent works in, and verifying the output. Six skill clusters earn their tokens in 2026 and beyond:
- Product taste — picking the right thing to build (see Engineer PM Convergence, Printing Press Software Democratization)
- Harness engineering — designing the scaffolding around the model (see Agent Harness Engineering, Claude Code Best Practices)
- Alignment-first planning — reach a shared design concept before any artifact (see Design Concept Grilling, Vertical Slice Tracer Bullets)
- Architecture for agents — codebase shape that conserves model attention (see Deep Modules for Agents, Context Window Smart Zone)
- Verification & review — mechanical feedback loops + fresh-context review (see Agent Loop Pattern, Harness Shrinkage as Models Improve)
- Strategic positioning — pick moats AI doesn't dissolve (see Seven Powers Applied to AI)
The frame is co-worker, not tool: interview the model, treat its failures as harness signals, build for the model six months out, prune crutches every release. Soft skills (judgment, EQ, taste) and domain knowledge become more valuable, not less.
## I. The mindset shift
From "I write code" to "I decide what gets built and verify it works"#
Boris Cherny's printing-press analogy frames this directly: software-writing is at the same democratization inflection literacy hit in 1400 (Printing Press Software Democratization). Cost of production collapses; what you build for whom becomes the differentiator. Boris's claim — "the best person to write accounting software is a really good accountant, not an engineer, because they know the domain really well and coding is the easy part" — is a directive: invest in domain depth, not coding cleverness.
Cat Wu's blunter version: "As code becomes much cheaper to write, the thing that becomes more valuable is deciding what to write" (Engineer PM Convergence).
Implications for an individual engineer:
- Hire-yourself bar shifts to taste. Cat's hiring bias at Claude Code is "engineers with great product taste." If you can't articulate why feature X matters more than feature Y, that gap is now your bottleneck — not your TypeScript.
- Domain depth compounds. A backend engineer who deeply understands clinical workflows beats a senior engineer with no domain. Pick a domain. Stay long enough to know its tacit constraints.
- Cross-disciplinary range matters more than vertical depth. Cat reports that every functional role on the Claude Code team codes — designers ship code, PMs ship code, the data scientist codes (Engineer PM Convergence). In the reverse direction, engineers who can also do design, PM, or data work compound their leverage.
From "tool I drive" to "co-worker I interview"#
The most underrated technique Cat Wu names: when the agent does something wrong, ask it why (Model Introspection Feedback). Don't re-prompt with corrections. Read the model's account of its own reasoning, then fix the harness — not the model — based on what surfaces.
The reframe: the model's behavior is a function of the harness; the failure is information about the harness. Your job is to design an environment where the model can succeed, not to make the model smarter.
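To make the ask-it-why move concrete, here is a hypothetical introspection prompt (the file name and scenario are illustrative, not from the source):

```text
You just rewrote the retry logic in api/client.ts, but the task only asked
for a timeout fix. Before I change anything: walk me through why you decided
the rewrite was necessary. What in the instructions, CLAUDE.md, or the repo
state led you there?
```

Whatever the model reports (a vague task card, a stale rule in CLAUDE.md, a misleading file layout), the fix goes into the harness, not into a longer corrective prompt.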
### Internalize the constraint: smart zone, not 1M tokens
Matt Pocock (citing Dex Hardy) frames the hardest constraint: LLMs degrade quadratically with context size because attention is O(n²). The first ~100K tokens are the smart zone; beyond that the model "gets dumber and dumber" regardless of advertised window (Context Window Smart Zone). 1M-token windows shipped in 2026 "just shipped a lot more dumb zone" — useful for retrieval, not reasoning.
Practical: every minute spent learning to manage context budget pays back tenfold. Status-line token counters are essential, not optional.
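A back-of-envelope under the quadratic-attention assumption the sources lean on: growing context from 100K to 1M tokens multiplies attention work by (1M / 100K)² = 100× while the window grows only 10×; the extra 900K tokens behave as retrieval space, not reasoning space.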
## II. Six skill clusters to develop
### 1. Product taste
What it is: the ability to pick the right thing to build and recognize when a response is on-character or off-character.
How to develop it:
- Ship things, get feedback, iterate fast. AI Native Product Cadence reports Anthropic's Claude Code team going from "see user feedback on Twitter to shipped product by end of week" — the loop tightness is how taste calibrates.
- Maintain a "what would I build differently?" file. When you use a product, note what's wrong and what you'd do instead. Compare your judgment to what the team actually shipped six months later.
- Practice character work. Claude Character as Product shows character (low-ego, lighthearted, bias-toward-action, honest feedback) is real product surface. Try to articulate why a given AI response feels right or wrong — that's the same eval skill in miniature.
- Lunchtime vibe-checks. Cat Wu runs team lunches asking each member "what is your vibe on the model?" before looking at metrics. Qualitative-first, data-second is a discipline you can practice on every model release.
### 2. Harness engineering
What it is: designing the scaffolding around the agent — context files, skills, hooks, subagents, permission classifiers, mechanical verifiers (Agent Harness Engineering).
How to develop it:
- Build a CLAUDE.md / AGENTS.md for every project you own. Treat it like code: review when things go wrong, prune ruthlessly, keep it as a table of contents pointing at deeper docs (Claude Code Best Practices); a sketch follows this list. 250K-token system prompts push the model into the dumb zone before it does anything.
- Practice push-vs-pull discipline (Deep Modules for Agents): always-in-context (CLAUDE.md, system prompt) for reviewer agents who need standards to compare against; on-demand skills for implementer agents.
- Run the introspection-debugging loop. When an agent fails, ask it why, then fix the harness — not the model.
- Read your own system prompt at every model launch. Cat Wu's discipline at Claude Code: "We read through the entire system prompt and reflect on, for each section, does the model really need this reminder anymore? If not, remove it." Most teams only add — subtract on cadence (Harness Shrinkage as Models Improve).
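A minimal sketch of the table-of-contents shape, with hypothetical paths and commands; the point is that CLAUDE.md holds pointers and hard rules, not inlined documentation:

```markdown
# CLAUDE.md (keep it short; link out, don't inline)

## Commands
- Build: `pnpm build` / Test: `pnpm test` (single file: `pnpm test src/foo.test.ts`)

## Read before touching
- Payments code: docs/payments-invariants.md
- Database migrations: docs/migrations.md

## Hard rules
- Never edit files under generated/
- Every new endpoint needs an integration test
```

Each pointer costs one line of always-in-context budget; the deeper doc is pulled on demand, which is exactly the push-vs-pull split described above.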
### 3. Alignment-first planning
What it is: reaching shared understanding (the "design concept" in Frederick Brooks's sense) before any artifact. The output of grilling is alignment; PRDs and plans are downstream (Design Concept Grilling).
How to develop it:
- Adopt a `grill-me` discipline. Matt Pocock's skill, verbatim: "Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the decision tree, resolving dependencies one by one. For each question provide your recommended answer. Ask the questions one at a time." Use this on yourself before writing PRDs.
- Reject specs-to-code as vibe coding. Pocock's strong claim: writing a careful spec, handing it to AI, and refusing to look at the code is vibe coding by another name. The code is the battleground, not the spec (Design Concept Grilling).
- Slice vertically, not horizontally. Don't do "all schema → all services → all UI." Do "thin slice through every layer, end-to-end, then the next slice" (Vertical Slice Tracer Bullets). Agents default to horizontal — push back actively.
- Build a Kanban with explicit blocking edges, not a phase plan. A numbered phase list locks one agent into sequential execution; a Kanban with `blocked-by:` edges lets multiple agents drain it in parallel (Agent Loop Pattern). A sketch follows this list.
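A minimal sketch of such a backlog as a markdown Kanban (card IDs and the exact `blocked-by:` notation are illustrative; the structure is what matters):

```markdown
## Ready
- [T3] POST /invoices endpoint            blocked-by: T1, T2
- [T4] Invoice list UI (thin slice)       blocked-by: T3

## In progress
- [T1] invoices table migration           (agent A)
- [T2] Invoice domain type + validation   (agent B)

## Done
- [T0] Tracer bullet: create one invoice end-to-end behind a flag
```

T1 and T2 share no edge, so two agents can drain them in parallel; a numbered phase plan would have serialized all five cards.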
### 4. Architecture for agents
What it is: codebase shape that lets agents work effectively — deep modules, clear test boundaries, conserved smart-zone budget (Deep Modules for Agents).
How to develop it:
- Internalize Ousterhout's deep-vs-shallow distinction. Deep module = small interface, large behavior, one natural test boundary. Shallow module = many small files, dense graph, unclear boundaries. Agents drift toward shallow by default; push back (see the sketch after this list).
- Keep a module map in your PRD. When planning, name the modules to be modified explicitly. This connects planning to architecture and prevents the agent from inventing new shallow modules instead of extending existing deep ones.
- Run periodic refactor passes that consolidate. Pocock's `improve-code-base-architecture` skill scans for clusters of related shallow modules and proposes deepening them. Schedule this work — it doesn't happen on its own.
- Reviewer in fresh context. If implementation used 80K tokens of smart zone, a same-context reviewer reads the diff in the dumb zone. Clear and review fresh (Deep Modules for Agents, Context Window Smart Zone).
- Pair with model selection. Matt Pocock: Sonnet for implementation, Opus for review — "I need the smarts then."
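A TypeScript sketch of the contrast, with a hypothetical invoicing domain. The deep version exposes one small entry point over substantial behavior, which is also its single natural test boundary:

```typescript
// Hypothetical domain types, for illustration only.
export interface Line { description: string; cents: number }
export interface InvoiceInput { customerId: string; lines: Line[] }
export interface Invoice { id: string; customerId: string; totalCents: number }

// Deep module: validation, totals, and ID assignment stay private,
// so callers (and agents) learn exactly one function.
export function createInvoice(input: InvoiceInput): Invoice {
  if (input.lines.length === 0) throw new Error("invoice needs at least one line")
  const totalCents = input.lines.reduce((sum, l) => sum + l.cents, 0)
  return { id: crypto.randomUUID(), customerId: input.customerId, totalCents }
}

// The shallow alternative exports validateInvoice, computeTotals, and
// buildInvoice from three thin files; every caller must reassemble the
// choreography, and an agent must load all three to change anything.
```

Testing through `createInvoice` alone also gives a fresh-context reviewer exactly one place to look.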
### 5. Verification and review
What it is: mechanical feedback loops (tests, types, linters, lints-as-instructions) that set the ceiling on what agent loops can do. Without good verification, you are coding blind (Agent Loop Pattern, Agent Harness Engineering).
How to develop it:
- Treat tests/types/linters as the ceiling. Matt Pocock: "If your code base doesn't have feedback loops, you're never ever ever going to get decent AI output. The quality of your feedback loops influences how good your AI can code. That is the ceiling." Invest in this infrastructure before scaling agent use.
- Write lint error messages as remediation instructions. OpenAI's Codex team writes lint error messages as instructions injected directly into agent context — the agent reads the lint output and knows how to fix it (Agent Harness Engineering). See the sketch after this list.
- Adopt the AFK vs human-in-loop split (Agent Loop Pattern). AFK tasks (implementation, refactoring, doc gardening, CI healing) are loop-eligible. Human-in-loop tasks (alignment, design choices, prioritization, QA) are not. Trying to loop human-in-loop work produces drift.
- Prepare for the new bottleneck: review. Matt Pocock's confession, echoed by Cat Wu: when agents ship more code, humans review more code. This is the unsolved problem of 2026. Develop your code-review fluency now — it's the durable skill loops can't replace.
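A minimal sketch of lints-as-instructions as a standalone TypeScript check script (hypothetical rule and logger module, not OpenAI's actual tooling). The error text is written as a remediation instruction the agent can act on the moment the output lands in its context:

```typescript
// check-no-console.ts: run in CI or an agent hook; the exit code gates the loop.
import { readFileSync } from "node:fs"

let failed = false
for (const file of process.argv.slice(2)) {
  readFileSync(file, "utf8").split("\n").forEach((line, i) => {
    if (line.includes("console.log(")) {
      failed = true
      // An instruction, not just a violation report:
      console.error(
        `${file}:${i + 1}: console.log is banned in src/. ` +
          `Fix: import { logger } from "./logger" and call logger.debug(...) ` +
          `with the same message. Do not delete the log line.`
      )
    }
  })
}
process.exit(failed ? 1 : 0)
```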
### 6. Strategic positioning
What it is: choosing problems and moats that survive the AI shift, not ones that erode under it (Seven Powers Applied to AI).
How to develop it:
- Audit the moat of any business / project / role you bet on. Process power and switching costs erode under AI; network effects, scale economies, and cornered resources persist. Counter-positioning amplifies — startups can choose business models incumbents structurally can't.
- At the personal-career level, the same logic applies. "I have 15 years of process knowledge nobody else has" is process power that AI is now hill-climbing. "I have a network of trusted relationships in this niche" is a network-effects moat that AI doesn't replicate.
- Build AI-native from day one. Boris Cherny: a startup builds AI-native; an incumbent has to retrain people, change processes, overcome internal resistance. The same applies to your individual workflow — rebuild your habits AI-native rather than bolt AI onto pre-AI workflows.
## III. Daily practices
| Practice | Cadence | Source |
|---|---|---|
| Run a `grill-me` session before any non-trivial feature | Per feature | Design Concept Grilling |
| `/clear` between unrelated tasks | Every task switch | Claude Code Best Practices |
| Keep a status-line token counter visible | Always | Context Window Smart Zone |
| Slice work vertically; reject horizontal phasing | Per planning session | Vertical Slice Tracer Bullets |
| Reviewer agent in fresh context (different model OK) | Per non-trivial diff | Deep Modules for Agents |
| Read your CLAUDE.md / system prompt at every model release; prune | Per model launch | Harness Shrinkage as Models Improve |
| Ask the model why it failed before re-prompting | On any unexpected behavior | Model Introspection Feedback |
| Run AFK loops on Kanban backlogs overnight | Continuously | Agent Loop Pattern |
| Build for the model six months out, not today's | Strategic horizon | Harness Shrinkage as Models Improve |
| Lunchtime vibe-check on new model releases | Per model release | Claude Character as Product |
## IV. Anti-patterns to unlearn
| Anti-pattern | Why it fails | What to do instead |
|---|---|---|
| Treating context window as "1M tokens, plenty of room" | Quadratic attention; ~100K smart zone is real | Status-line counter; /clear aggressively; subagents for investigation |
| Adding to system prompt forever, never removing | Crutches accrete; old crutches contradict new model behavior | Prune at every model launch; every section must justify its tokens |
| Asking agent for a plan before alignment | Agent papers over open questions; rework cost paid in implementation | grill-me first; PRD only after alignment |
| Horizontal layered phases ("all schema, then all service") | No end-to-end feedback until phase 3; mismatches paid late | Vertical slices; tracer-bullet thin paths |
| Same-context reviewer | Implementer's smart-zone is exhausted; reviewer in dumb zone | Fresh context for review; consider stronger model for review |
| Specs-to-code without engaging the code | "Vibe coding by another name" — feedback loop runs through wrong layer | Stay in the code; specs are downstream of alignment |
| Looping human-in-loop work | Agent makes plausible-but-wrong calls; drift accumulates | AFK tagging; human-in-loop tasks stay synchronous |
| "Bigger model = no design needed" | Bad codebases produce bad agents regardless of model size | Deep modules; mechanical verification |
| Treating model failure as "model is dumb" | Misses signal about harness gaps | Introspect: ask the model why; fix harness |
| Defending switching-cost / process-power moats | These erode under AI | Pivot to network effects / scale / cornered resources / counter-positioning |
## V. What stays human
Cat Wu explicitly names what isn't merging into the model: tacit, common-sense, EQ-heavy work — knowing the right venue to communicate with stakeholders, sensing when a launch is ready, knowing what counts as a fair trade-off (Engineer PM Convergence). Humans still provide the connective tissue across a launch.
Concretely durable human skills:
- Code review fluency — the new bottleneck once agents ship faster (AI Native Product Cadence; Matt Pocock's confession)
- Convicted articulation — Amanda's character-work skill: saying why a given output is on-character or off-character with conviction (Claude Character as Product)
- Cross-functional EQ — knowing when to escalate, what the right venue is, how to read a stakeholder's reluctance
- Mission/values clarity as tiebreaker — Cat: "If there's two competing priorities, we'll talk about which one is more important for Anthropic's mission." Removes coordination cost (AI Native Product Cadence)
- Domain depth — the accountant who can now write accounting software beats the engineer with no accounting context (Printing Press Software Democratization)
## VI. A 90-day learning plan
Days 1–14 — Get fluent in the harness.
- Set up Claude Code or equivalent with status-line token counter
- Write a CLAUDE.md / AGENTS.md for one project; prune it weekly
- Practice `/clear` between tasks; observe how it changes output quality
- Read Claude Code Best Practices, Agent Harness Engineering, and the source raws
Days 15–30 — Adopt alignment-first planning.
- Install or write a `grill-me` skill; use it before any feature (a sketch follows this list)
- Slice your next two features vertically; resist horizontal layering
- Convert your todo list into a Kanban with `blocked-by:` edges
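One low-friction way to install the skill, assuming Claude Code's custom slash commands (a markdown file under `.claude/commands/` becomes a `/grill-me` command, with `$ARGUMENTS` standing in for whatever you type after it):

```markdown
<!-- .claude/commands/grill-me.md -->
Interview me relentlessly about every aspect of this plan until we reach
a shared understanding. Walk down each branch of the decision tree,
resolving dependencies one by one. For each question provide your
recommended answer. Ask the questions one at a time.

Plan to grill: $ARGUMENTS
```

Then `/grill-me add invoice export` starts the interview before any PRD exists.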
Days 31–60 — Build mechanical feedback infrastructure.
- Add or strengthen tests/types/linters in one project until they catch agent drift
- Write lint error messages as remediation instructions
- Set up reviewer-in-fresh-context for non-trivial diffs (different model preferred)
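For the fresh-context reviewer, a hypothetical invocation assuming the `claude` CLI's print mode and model flag (file names illustrative): `claude --model opus -p "Review review.patch against docs/payments-invariants.md and list violations"`. The implementer's exhausted context never touches the review.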
Days 61–90 — Run AFK loops; develop product taste.
- Set up a Ralph loop or `/loop` cron on your Kanban backlog overnight
- Keep a "what would I build differently?" journal for products you use
- Practice introspection-debugging: when the agent fails, ask why, fix harness
- Audit the moat of one business / project / domain you care about against Seven Powers Applied to AI
## VII. Source confidence and gaps
- High confidence: smart-zone framing, harness shrinkage, vertical slicing, deep modules, AFK/human-in-loop split, introspection technique. Multiple converging sources from inside Anthropic and from independent practitioners (Matt Pocock).
- Medium confidence: 100-line Claude Code prediction (hyperbolic by Boris Cherny's own framing); printing-press analogy timeline (faster than 50 years, exact rate uncertain); product-taste-as-bottleneck (true at small Anthropic-style teams, scaling unclear).
- Open questions: How much of Anthropic's cadence is process vs talent density? Does engineer-PM convergence scale beyond ~50-person teams? How reliable are 4.7-class introspection reports? When does a stronger model render the harness unnecessary entirely vs requiring different harness?
The wiki source set leans heavily on Anthropic's own narrative and one independent practitioner (Matt Pocock). Treat as well-grounded for individual workflow guidance, less battle-tested for organization-scale deployment.
## Sources
- Engineer PM Convergence — roles merging at Anthropic; product taste as bottleneck
- Printing Press Software Democratization — Boris Cherny's macro analogy
- Harness Shrinkage as Models Improve — pruning at every launch; build for next model
- Agent Loop Pattern — `/loop`, Ralph loop, Sandcastle; AFK vs human-in-loop
- Context Window Smart Zone — quadratic attention; 100K marker; clear-and-restart
- Vertical Slice Tracer Bullets — vertical > horizontal; Kanban over phase plans
- Design Concept Grilling — `grill-me`; alignment before artifact
- Deep Modules for Agents — Ousterhout for agent codebases; push vs pull
- Model Introspection Feedback — ask the model why it failed
- AI Native Product Cadence — 6mo→1mo→1day; mission as tiebreaker
- Claude Character as Product — character work; vibe-check eval discipline
- Claude Code Best Practices — explore→plan→code, environment config, scaling
- Agent Harness Engineering — invariants not implementations; AGENTS.md as ToC
- Seven Powers Applied to AI — which moats survive AI; counter-positioning amplified
## Raw documents
- Anthropic's Boris Cherny: Why Coding Is Solved, and What Comes Next
- How Anthropic's product team moves faster than anyone else | Cat Wu (Head of Product, Claude Code)
- Full Walkthrough: Workflow for AI Coding — Matt Pocock
- Best Practices for Claude Code
- Effective harnesses for long-running agents
- Harness engineering: leveraging Codex in an agent-first world
