
Jagged Intelligence (Ghosts, Not Animals)

Published: May 23, 2026 · Filed: Concept · Domain: LLM Architecture · Tags: LLM Architecture, AI Safety, Mental Model · Reading: 6 min · Source: AI-synthesised

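"Ghosts, not animals": jagged statistical circuits with no intrinsic motivation; the car-wash and strawberry failures; stay in the loop and treat them as tools.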

[Figure: diagram of Jagged Intelligence (Ghosts, Not Animals)]

Sources

Andrej Karpathy: From Vibe Coding to Agentic Engineering

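Summary

Andrej Karpathy's mental model of what LLMs actually are: not animal-type intelligences shaped by evolution, intrinsic motivation, curiosity, or empowerment, but "ghosts": jagged statistical simulation circuits summoned from internet data, with RL bolted on afterwards. "Jaggedness" names an empirical fact: the same model can refactor a 100,000-line codebase and find zero-days, yet tell you to walk to a car wash 50 meters away to get your car washed. The framing matters because holding the right model of the entity makes you better at wielding it: you stop expecting human-shaped failure modes and start staying in the loop where the jaggedness bites.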

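Examples of jaggedness

Strawberry letters. The classic "how many R's are in strawberry" failure (since patched).
Car wash. Current SOTA: "I want to drive to a car wash 50 meters away to wash my car. Should I drive or walk?" → the model says walk, missing that the thing being washed is the car itself. "How is it possible that Opus 4.7 can refactor a 100,000-line codebase and find zero-days, yet tells me to walk to the car wash? That's insane."
MenuGen email matching. His agent cross-referenced funds between Stripe and Google by email address rather than by a persistent user ID; see Vibe Coding vs. Agentic Engineering.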
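The MenuGen failure can be sketched in a few lines. The records, field names (`user_id`, `email`, `amount`) and values below are hypothetical, invented for illustration; they are not taken from Karpathy's code or from real Stripe or Google payloads. The sketch only shows why an email join is fragile: a user whose email differs between the two systems silently drops out of the match, while a join on a persistent ID still finds them.

```python
# Hypothetical records; field names and values are illustrative assumptions,
# not real Stripe or Google API payloads.
payments = [
    {"user_id": "u1", "email": "ann@example.com", "amount": 10},
    {"user_id": "u2", "email": "bob@old-mail.example", "amount": 5},
]
accounts = [
    {"user_id": "u1", "email": "ann@example.com"},
    {"user_id": "u2", "email": "bob@new-mail.example"},  # email changed on this side
]


def match_by(key, payments, accounts):
    """Return the user_ids of payments whose `key` value matches some account."""
    known = {account[key] for account in accounts}
    return [p["user_id"] for p in payments if p[key] in known]


matched_by_email = match_by("email", payments, accounts)
matched_by_id = match_by("user_id", payments, accounts)

print(matched_by_email)  # ['u1']  (u2's payment is silently unmatched)
print(matched_by_id)     # ['u1', 'u2']
```

Nothing in the email version raises an error, which is what makes this kind of mistake easy to miss unless someone is reviewing what the agent wrote.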

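Jaggedness is the symptom; verifiability plus what the labs choose to train on is the proposed cause. Out-of-distribution circuits are exactly where the peaks fall into valleys.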

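Ghosts, not animals

We are not building animals; we are summoning ghosts.

The substrate is pretraining (statistics), with capabilities bolted on by RL, which "amplifies" the "weaknesses" of the statistical base. The consequences he draws:

Yelling doesn't help. "If you yell at them, they won't do any better or worse; it has no effect." There are no emotions, no morale, and no inner drive to model.
There is no five-step fix. Karpathy concedes the framing may lack "real power": it is mostly a posture of skepticism and ongoing empirical exploration, not a recipe. "It's more about being suspicious of it and figuring it out slowly over time."

That honesty is the point: a calibrated, slightly distrustful ghost model beats an anthropomorphized animal model.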

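Why this framing changes how you build

If models are jagged ghosts, then:

Stay in the loop. "You really have to stay in the loop a bit, treat them as tools, and keep track of what they are doing." (This is exactly the discipline of Vibe Coding vs. Agentic Engineering.)
Don't anthropomorphize the failure surface. Errors don't show up where a human would make them; they show up at the edges of the distribution (the car wash, the email ID).
Map your circuits. Work out whether your task falls in-distribution (you fly) or out-of-distribution (you struggle and may need fine-tuning). This is The Verifiability Thesis put into practice.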
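One way to make "map your circuits" and "stay in the loop" concrete is a small probe harness. This is a minimal sketch, assuming you can write a deterministic checker for each probe. Everything in it is hypothetical: `ask_model` is a stub standing in for whatever model client you use (it returns canned answers so the sketch runs offline), and the category names and the 0.9 threshold are arbitrary illustrations, not anything Karpathy prescribes.

```python
from collections import defaultdict


def ask_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; returns canned answers."""
    canned = {
        "How many times does the letter r appear in 'strawberry'?": "3",
        "My car needs a wash and the car wash is 50 m away. Drive or walk?": "walk",
    }
    return canned.get(prompt, "")


# Each probe: (task category, prompt, checker that verifies the answer).
PROBES = [
    ("letter-counting",
     "How many times does the letter r appear in 'strawberry'?",
     lambda answer: answer.strip() == str("strawberry".count("r"))),
    ("everyday-reasoning",
     "My car needs a wash and the car wash is 50 m away. Drive or walk?",
     lambda answer: "drive" in answer.lower()),
]


def map_circuits(probes, threshold=0.9):
    """Return {category: (pass_rate, needs_human_review)}."""
    results = defaultdict(list)
    for category, prompt, check in probes:
        results[category].append(check(ask_model(prompt)))
    report = {}
    for category, outcomes in results.items():
        pass_rate = sum(outcomes) / len(outcomes)
        # Categories below the threshold are where a human stays in the loop.
        report[category] = (pass_rate, pass_rate < threshold)
    return report


report = map_circuits(PROBES)
for category, (rate, review) in report.items():
    print(f"{category}: pass rate {rate:.0%}, human review needed: {review}")
```

With a real model behind `ask_model` and more than one probe per category, the low-scoring categories mark the places where you review every output or consider fine-tuning, and the high-scoring ones are where you can afford to move faster.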

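Will jaggedness shrink over time?

Karpathy hopes so but isn't sure, and he again locates the cause in training rather than in anything fundamental: aesthetics, taste, and simplicity are "probably not within the scope of RL". His nanoGPT simplification anecdote: the model "hates" being asked to make code simpler and "can't do it", a sign that you are outside the RL circuits ("like pulling teeth, not the speed of light"). He argues that "nothing fundamental is preventing it; the labs just haven't done it yet." So jaggedness is contingent, not essential, but it is real today.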

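Open questions

Karpathy admits the framing may have no "real power". Is "ghosts vs. animals" a load-bearing structure, or a useful intuition pump that changes no concrete decision?
If taste, aesthetics, and simplicity enter the RL training mix, will the jaggedness along those dimensions smooth out, or are they too hard to verify to be rewarded cleanly (see The Verifiability Thesis)?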


About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.
