Sources#
- An open-source spec for Codex orchestration: Symphony.
- Anthropic's Boris Cherny: Why Coding Is Solved, and What Comes Next
- Auto mode for Claude Code
- Best Practices for Claude Code
- Full Walkthrough: Workflow for AI Coding — Matt Pocock
- How Anthropic's product team moves faster than anyone else | Cat Wu (Head of Product, Claude Code)
- Introducing Claude Opus 4.7
- Tips & Best Practices
- Tutorial: Team Telegram Assistant
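Summary#
Anthropic's official guide to using Claude Code effectively is organized around a single core constraint: the context window fills up quickly, and performance degrades as it fills. Every best practice follows from managing this scarce resource: verification-driven development, structured context (CLAUDE.md), aggressive session management, and horizontal scaling through parallel sessions.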
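Details#
Context Window as the Primary Constraint#
The context window holds the entire conversation: messages, file reads, and command outputs. A single debugging session can consume tens of thousands of tokens. As the context fills, Claude "forgets" earlier instructions and makes more mistakes. Every best practice ultimately manages this resource. For the underlying mechanism, see Context Window Smart Zone (quadratic attention scaling, a smart-zone marker at roughly 100K tokens).
Model-level amplifiers (as of Claude Opus 4.7): the updated tokenizer maps the same input to 1.0–1.35× as many tokens, and Opus 4.7 "thinks more at higher effort levels", especially on later turns in agentic settings. Claude Code's default effort has been raised to xhigh. These factors compound: a session that still fit at high on 4.6 can be noticeably tighter at xhigh on 4.7. Do not trust intuitions carried over from 4.6; measure on real traffic first. Countermeasures: lower effort, task budgets (API), explicit prompting for concision, or brevity-style output caps (see Scale-Dependent Prompt Sensitivity).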
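To make the tokenizer effect concrete, here is a rough arithmetic sketch. The 1.0–1.35× range comes from the paragraph above; the 200,000-token window and the 120,000-token session are illustrative assumptions rather than measurements, and `inflated_range` is a hypothetical helper name.

```python
def inflated_range(tokens_on_4_6: int, low: float = 1.0, high: float = 1.35) -> tuple[int, int]:
    """Return the (best, worst) token count for the same input after tokenizer inflation."""
    return round(tokens_on_4_6 * low), round(tokens_on_4_6 * high)


WINDOW = 200_000   # assumed context window size, for illustration only
session = 120_000  # assumed size of a session as measured on 4.6

best, worst = inflated_range(session)
print(f"best case:  {best:,} tokens ({best / WINDOW:.0%} of the window)")
print(f"worst case: {worst:,} tokens ({worst / WINDOW:.0%} of the window)")
```

This covers only the tokenizer. The extra thinking at higher effort levels comes on top of it, which is why measuring on real traffic is safer than extrapolating.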
Verification-Driven Development#
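The single highest-leverage practice: give Claude a way to verify its own work. Provide tests, screenshots, expected outputs, or linter commands. Without verification, Claude produces code that looks plausible but is broken, and the human becomes the only feedback loop.
Key patterns:
- Provide concrete test cases with inputs and expected outputs
- For UI changes, paste screenshots and ask Claude to compare its own result against them
- Address root causes by supplying the actual error messages rather than just saying "the build is failing"
- Use the Claude in Chrome extension for automated UI testing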
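As an illustration of the first pattern, a set of test cases with explicit inputs and expected outputs might look like the sketch below. The `slugify` function and its cases are hypothetical, chosen only to show the shape of a verification target that Claude can run by itself.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical function under test: lowercase, hyphen-separated URL slug."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


# Each case pairs an input with the exact output expected for it.
CASES = [
    ("Hello World", "hello-world"),
    ("  Context   Window  ", "context-window"),
    ("Opus 4.7: What's New?", "opus-4-7-what-s-new"),
    ("", ""),
]


def run_cases() -> list[str]:
    """Return a readable failure message for every case that does not match."""
    failures = []
    for given, expected in CASES:
        actual = slugify(given)
        if actual != expected:
            failures.append(f"slugify({given!r}) -> {actual!r}, expected {expected!r}")
    return failures


if __name__ == "__main__":
    problems = run_cases()
    print("all cases pass" if not problems else "\n".join(problems))
```

Failure messages that name the input, the actual value, and the expected value give the agent something specific to act on, which is the point of the error-message pattern in the list above.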
Explore → Plan → Code Workflow#
Separate research from implementation. Use Plan Mode for multi-file changes or unfamiliar code. Skip planning when the scope is clear and the diff can be described in one sentence.
A more aggressive variant: Design Concept Grilling (Matt Pocock's grill-me skill) replaces "ask the agent for a plan" with "before any plan exists, have the agent interview you until you reach a shared understanding". See also Vertical Slice Tracer Bullets for slicing the resulting PRD into agent-grabbable Kanban tickets, and Deep Modules for Agents for shaping the codebase to be agent-friendly.
Environment Configuration#
- CLAUDE.md: persistent instructions loaded in every session. Include only what Claude cannot infer from the code: bash commands, non-default code style, workflow rules, architectural decisions, gotchas. Prune ruthlessly: if Claude already does the right thing without an instruction, delete it. Treat it like code: review it when something goes wrong, and test it by observing behavior changes. Use `@path` imports for modularity. For founders and solo builders, a stricter discipline is to start every session with CLAUDE.md as architectural context and update it at the end of every session; this is the main defense against Agentic Technical Debt, which compounds (rather than merely accumulates) because every session re-derives foundational decisions when context is not persisted.
- Skills (`.claude/skills/`): domain knowledge and reusable workflows, loaded on demand rather than in every session. Invoke with `/skill-name`.
- Subagents (`.claude/agents/`): specialized assistants that run in an isolated context with scoped tools. Suited to tasks that need to read many files without cluttering the main context.
- Hooks: deterministic scripts that run at specific points in Claude's workflow. Unlike CLAUDE.md (advisory), hooks are guaranteed to run.
- MCP servers: connect external tools (Notion, Figma, databases) via `claude mcp add`.
- Plugins: bundled skills + hooks + subagents + MCP from a marketplace.
- Permissions: auto mode (classifier-based approval, a middle ground between the default prompts and `--dangerously-skip-permissions`), allowlists, or OS-level sandboxing.
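The "prune ruthlessly" advice for CLAUDE.md in the list above can be backed by a small audit script. The sketch below is only an assumption-laden illustration: the 100-line default budget is a hypothetical number (the best length is an open question), the regular expression is a guess at what an `@path` import looks like, and `audit_claude_md` is not part of any tool.

```python
import re

# Matches tokens such as @docs/git-workflow.md; the exact import grammar is an assumption.
IMPORT_PATTERN = re.compile(r"(?<!\S)@([\w./~-]+)")


def audit_claude_md(text: str, max_lines: int = 100) -> dict:
    """Summarize a CLAUDE.md body: non-blank line count, imports, and budget status."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    return {
        "lines": len(lines),
        "imports": IMPORT_PATTERN.findall(text),
        "over_budget": len(lines) > max_lines,  # max_lines is a hypothetical budget
    }


sample = """# Build
- Run tests with: npm test
- Git workflow: @docs/git-workflow.md
"""

report = audit_claude_md(sample, max_lines=2)
print(report)
```

A check like this only flags size. Deciding which instructions Claude already follows without being told still requires observing behavior, as described above.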
Session Management#
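- `/clear` between unrelated tasks: prevents context pollution
- `/compact <instructions>`: targeted summarization that preserves the specified context
- `/rewind` or `Esc+Esc`: restore the conversation, the code, or both to any checkpoint
- Subagents for investigation: explore in a separate context and report back summaries
- `/btw`: side questions that never enter the conversation history
- After two consecutive failed fixes on the same issue, `/clear` and rewrite the prompt with what you learned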
Scaling Patterns#
- Non-interactive mode: `claude -p "prompt"` for CI, scripts, and pre-commit hooks. Supports JSON and streaming output.
- Parallel sessions: desktop app (isolated worktrees), web (isolated VMs), or agent teams (sessions coordinated through shared tasks).
- Writer/Reviewer pattern: one session implements, another reviews with fresh context, so the reviewer is not biased toward its own code.
- Fan-out: for large migrations, run `claude -p` in a loop over files. Scope permissions with `--allowedTools`.
- Auto mode for unattended runs: the classifier blocks risky actions and allows routine work. In non-interactive mode, repeated blocks abort the run.
- Loops and routines: `/loop` (cron-scheduled repeat job, in-CLI) and routines (the server-side variant). They drain a Kanban backlog while you are AFK; the main mechanism is amortizing planning cost over many executions. See Agent Loop Pattern.
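The fan-out item in the list above can be sketched as a small driver script. This is a minimal illustration under stated assumptions: the prompt wording, the file list, and the `Edit` value passed to `--allowedTools` are placeholders, and `build_command` and `fan_out` are hypothetical helper names. Only `claude -p` and `--allowedTools` come from the text. The script defaults to a dry run so the commands can be inspected before anything is executed.

```python
import subprocess
from pathlib import Path


def build_command(path: Path, allowed_tools: str = "Edit") -> list[str]:
    """Build one non-interactive invocation for a single file."""
    # The prompt text is a placeholder; adapt it to the actual migration.
    prompt = f"Migrate {path} to the new API. Reply OK on success or FAIL with a reason."
    return ["claude", "-p", prompt, "--allowedTools", allowed_tools]


def fan_out(paths: list[Path], dry_run: bool = True) -> list[list[str]]:
    """Collect one command per file and, unless dry_run is set, execute each in turn."""
    commands = [build_command(p) for p in paths]
    if not dry_run:
        for cmd in commands:
            # Each invocation is a separate process; a failure does not stop the loop.
            subprocess.run(cmd, check=False)
    return commands


if __name__ == "__main__":
    files = [Path("src/a.py"), Path("src/b.py")]  # hypothetical file list
    for cmd in fan_out(files):
        print(cmd)
```

Passing the arguments as a list rather than a shell string avoids quoting problems when file names or prompts contain spaces.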
Parallel Ecosystems and Cross-Tool Concept Mapping#
Claude Code is one of several converging coding-agent ecosystems. Capability parallels with Hermes Agent (Nous Research) and Codex (OpenAI):
| Capability | Claude Code | Hermes | Codex |
|---|---|---|---|
| Project context file | CLAUDE.md | AGENTS.md (project) + SOUL.md (personality, separate) | AGENTS.md |
| Session compaction | /compact <instructions> | /compress | (via Codex App Server thread compaction) |
| Mid-session model switch | /model | /model | session-level config |
| Parallel subagents | Subagents in .claude/agents/ | delegate_task | Spawned via Symphony orchestrator |
| Non-interactive / programmatic | claude -p, Claude Agent SDK | hermes CLI in scripts | Codex App Server (JSON-RPC stdio) |
| Multi-user team deployment | per-session claude -p | Hermes Gateway (Telegram/Discord/Slack/WhatsApp) with allowlist or DM pairing | Symphony (issue-tracker-driven daemon) |
| Permission gating | auto mode classifier | per-pattern approvals (once/session/always/deny); skipped under container backend | implementation-defined per Symphony spec |
| Memory model | conversation + CLAUDE.md | bounded MEMORY.md (~2,200 chars) + USER.md (~1,375 chars) | filesystem-driven |
The structural insight shared by all three: agent behavior is configured through repo-versioned markdown files (CLAUDE.md / AGENTS.md / SOUL.md / WORKFLOW.md). The pattern is consistent enough across vendors to look like an emerging standard. (A dedicated Agent Context Files concept page is planned to formalize this.)
The biggest architectural divergence: Claude Code is session-first with an optional non-interactive mode, while Hermes Gateway and Symphony are daemon-first when deployed at team scale. The session-vs-daemon split is the main deployment-architecture choice in 2026.
Common Failure Patterns#
| Pattern | Fix |
|---|---|
| Kitchen sink session (mixing unrelated tasks) | Use /clear between tasks |
| Repeated corrections (>2 failed fixes) | /clear, then rewrite the prompt with lessons learned |
| Over-specified CLAUDE.md | Prune; convert deterministic rules to hooks |
| Trust-then-verify gap | Always provide verification criteria |
| Infinite exploration | Narrow the scope, or use subagents |
Related Links#
- Agent Harness Engineering: Claude Code's CLAUDE.md, skills, and hooks are a practical implementation of the harness engineering patterns described by the OpenAI and Anthropic research teams
- LLM-as-Compiler Knowledge Base: CLAUDE.md files act as the schema layer in this vault's LLM-as-compiler architecture
- LLM-Driven Vulnerability Research: Claude Code is the runtime for Anthropic's vulnerability research scaffold; all Mythos Preview findings used Claude Code's agentic capabilities
- Client-Side Agent Optimization: directly challenges the "use the strongest model" default; combinations of Claude Opus 4.6 with a cheaper planner beat all-Opus by >40pp on HotpotQA. AgentOpt's httpx interception is compatible with `claude -p` non-interactive mode
- Scale-Dependent Prompt Sensitivity: complements context-window management; brevity constraints both improve accuracy on overthinking-prone problems and preserve context budget. Verification-driven development matters most when large-model verbosity can mask reasoning errors
- Claude Code Auto Mode: full description of the "auto mode" permission option mentioned under Environment Configuration and Scaling Patterns
- Claude Opus 4.7: the model that most Claude Code work currently targets; literal instruction following and tokenizer inflation directly reshape how CLAUDE.md is written and how sessions are managed
- Hermes Agent: a parallel ecosystem from Nous Research; many Claude Code patterns map directly (`/compress` ↔ `/compact`, `delegate_task` ↔ subagents, `AGENTS.md` ↔ `CLAUDE.md`); the differences (Gateway daemon, bounded memory files, `SOUL.md` split) highlight each side's design choices
- Codex App Server Protocol: the OpenAI-side analog of `claude -p` + Claude Agent SDK; both let an external orchestrator drive sessions, but App Server is more explicit about a stable JSON-RPC stdio protocol
- Symphony: the daemon-first deployment archetype; a Claude Code analog would wire `claude -p` plus subagents into an issue tracker, the way Symphony wires Codex to Linear
- Ticket-Driven Agent Orchestration: the orchestration pattern that emerges naturally once non-interactive mode is solid; connects single-session best practices to team-scale deployment
- Context Window Smart Zone: the underlying constraint that drives every context-management practice on this page
- Design Concept Grilling: a more aggressive, alignment-first variant of explore→plan→code
- Vertical Slice Tracer Bullets: the task decomposition pattern that fills the Kanban backlog drained by the loop primitive
- Deep Modules for Agents: the codebase shape that makes Claude Code review and verification patterns reliable; push-vs-pull instruction delivery
- Agent Loop Pattern: `/loop` and routines as the next-generation primitive that replaces per-step prompting
- Harness Shrinkage as Models Improve: why best-practice prompts and CLAUDE.md sections get shorter with each model release; Cat Wu's discipline of pruning at every launch
- Claude Code: entity-level page
- AI Native Product Cadence: these best-practice artifacts are the external output of a team operating at that internal cadence
- Engineer PM Convergence: the engineer-with-product-taste persona that this guide implicitly targets
- Agentic Technical Debt: the failure mode that CLAUDE.md primarily defends against; named explicitly in the founder's playbook
- AI-Native Startup Lifecycle: founder-stage framing that elevates CLAUDE.md from "best practice" to "MVP survival discipline"
- MCP and Computer Use: the connector substrate behind the "extend Claude Code with custom tools" scaling pattern; MCP and computer use are how external systems become part of the agent's action surface
- Evals as Product Spec: the rigorous form of "verification-driven development"; ten excellent evals encode what done looks like at the feature level, complementing the workflow-level verification prescribed on this page
Derived Content#
- When to Use Claude Opus 4.6 for Work: the context-window-as-primary-constraint framing points to a Claude Code corollary, namely that Opus verbosity burns through the budget faster
- Opus 4.6 → 4.7 Changes and Multi-Agent Coding Considerations: applies the subagents, Writer/Reviewer, and scaling-pattern guidance to Opus 4.7 multi-agent teams
- Learning to Co-Work with AI: A Software Engineer's Field Guide: distills the best practices into a per-engineer skill-development field guide (six skill clusters, daily practices, anti-patterns, 90-day plan)
Open Questions#
- How long can CLAUDE.md get before instructions start getting lost? Is there a measurable threshold?
- How does the Writer/Reviewer pattern compare with agent-to-agent review (such as OpenAI's Codex workflow)?
- At what point does subagent overhead outweigh the benefit of context isolation?
Sources#
- Best Practices for Claude Code
- Auto mode for Claude Code — permission-mode expansion
- Introducing Claude Opus 4.7 — tokenizer/xhigh-default context-budget implications
