Recursive Self-Improvement

Published: June 7, 2026
Filed: Concept
Domain: Superintelligence Trajectory
Tags: Governance, Recursive Self Improvement, AI R&D, Capability Trajectory, Anthropic
Reading: 11 min
Source: AI-synthesised

An AI system autonomously designing and developing its own successor. The Anthropic Institute's When AI builds itself argues that AI is already accelerating AI development (engineers ship ~8× more code per quarter) and lays out three futures: stalled-but-diffused, compounding-efficiency, and full RSI.

Summary

Recursive self-improvement (RSI) is the point at which an AI system can fully autonomously design and develop its own successor, closing the loop so that each model is improved by the previous model rather than by humans. The Anthropic Institute essay When AI builds itself (Marina Favaro & Jack Clark, June 2026) is this wiki's primary source. Its argument has two halves: (1) a present-tense empirical claim that AI is already accelerating the development of AI (see AI Accelerating AI Development; for example, Anthropic engineers ship ~8× more code per quarter than in 2021–2025), and (2) an extrapolation that the trend "points to an AI system capable of fully autonomously designing and developing its own successor." Anthropic's stated position: "We are not there yet, and recursive self-improvement is not inevitable. But it could come sooner than most institutions are prepared for."

This page is the hub for the RSI cluster — the trajectory, the futures, and the governance response. The measured evidence lives in AI Accelerating AI Development; the capability-gating eval in AI R&D Autonomy Evaluation (AECI); the deployment brake in Responsible Scaling Policy Evaluations; and the coordination problem in Frontier Pause Verification.

Closing the loop

The essay frames RSI as the endpoint of a steadily tightening development loop, illustrated as person → computer → chatbot → agent → workers, with each stage delegating more of the work to AI:

2021–2023 — Building the first Claude. Humans write code and docs on laptops; AI is absent from the loop.
2023–2025 — Chatbots. People paste model-generated snippets into editors.
2025–2026 — Coding agents. Agents write and edit whole files on their own (Claude Code launches Feb 2025).
Today — Autonomous agents. Agents run their own code and delegate hours of work to other agents (the loop primitive running unattended).
20XX? — Closing the loop. "Agents could become capable enough to build and train models themselves. If this happens, future versions of Claude could be continuously improved by Claude itself." This last step is RSI.

"What if we're wrong?": why direction-setting may not save us

The natural objection: the work still in human hands, choosing which problems to work on (Research Taste as the Human Bottleneck), is what matters most, so AI remains a capable assistant, not an autonomous driver of progress. The essay answers in three steps:

Perspiration is becoming automated. AI advances rarely come from "eureka" moments; paradigm shifts (the Transformer, mixture-of-experts) "arrive years apart." In between, "most progress is incremental: we scale something up, see what breaks, fix it, and try again" — exactly the workflow Claude now excels at. Edison's "1% inspiration, 99% perspiration" is invoked: "we see perspiration becoming increasingly automated." Large-scale research progress "is mostly a function of tools and resources" — how fast and how many experiments you can run — which is the bitter lesson pushed to its limit.
A conservative reading still compounds. Even if Claude never gets research taste, if humans spend most of their time on the single-digit fraction of work that is direction-setting while Claude handles the rest, each human steers far more work than before. "AI already makes Anthropic move much faster than it did before."
The less-conservative reading. The early evidence of improving research judgment (51%→64% on next-step decisions; see AI Accelerating AI Development) suggests taste "might be just another AI capability that AI systems fail at for a time, then get good at" — the same pattern seen with explaining why a joke is funny, theory of mind, and linguistic riddles (Jagged Intelligence (Ghosts, Not Animals)).

Three possible futures

The essay lays out three scenarios for "what happens next," contingent on whether the trend continues and what we choose to do:

1. The trend stalls (S-curve), but today's capabilities diffuse widely. Exponentials bend: the judgment separating a competent researcher from a great one may not come from scaling compute and data and may require a new architecture beyond the Transformer, or the binding constraint may be the supply chain (energy, chip fabrication, grid, interconnect) rather than intelligence. Even frozen at today's capability, the world changes: Project Glasswing already shifted the cyber bottleneck from finding vulnerabilities to patching them (LLM-Driven Vulnerability Research), and a 100-person company can increasingly do the work of a 1,000-person one (AI-Native Startup Lifecycle). Anthropic thinks this future is unlikely: "we have not yet seen that curve bend."
2. Compounding efficiency gains; humans still set direction. AI development becomes substantially automated, but humans still judge the results. 100-person companies do the work of 10,000 to 100,000 people. This would revolutionize knowledge work and government, but it could also power authoritarian surveillance or individualized influence operations at superhuman scale. The essay says the evidence suggests this is the likely path, bounded by Amdahl's law (below).
3. Full RSI: AI builds its successors. Pace becomes determined entirely by compute (and algorithmic-efficiency discoveries). Humans move "most of our effort towards oversight, validation, and verification of an expanding 'virtual lab' run by AI systems," with skills transferring to the rest of science. How the alignment problem resolves here is what Anthropic is "least certain about": models may be aligned and wise enough to find novel solutions (or to halt), or "the rare occurrences of misalignment present in today's models could compound as the models build their successors, growing more frequent but less understood until we lose control."

Amdahl's law for organizations

A recurring brake across futures 2–3: speeding up one part of a process just shifts the bottleneck elsewhere; overall pace is capped by the parts that haven't sped up (Amdahl's law). Anthropic has already hit its signature: as more code flows through the org, human code review became the new bottleneck — the org-level instance of Verification as the New Bottleneck. The same friction appears beyond engineering: an explosion of ideas/initiatives/tools "far more than we have the capacity to pursue." Spotting and clearing these bottlenecks "may become the most important skill for any organization." This is also why "the felt pace of this future will still be set by the bottlenecks" — RSI can't run clinical trials faster than biology, hold elections sooner than constitutions allow, or turn a stranger into an old friend in a weekend.
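The bottleneck argument above has a standard quantitative form. As a minimal sketch: if a fraction p of an organization's work is accelerated by a factor s and the rest is unchanged, Amdahl's law gives an overall speedup of 1 / ((1 - p) + p / s). The function names and the 80% and 8× inputs below are illustrative assumptions, not figures or methods taken from the essay.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when a share of the work runs `factor` times faster."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    unchanged = 1.0 - accelerated_fraction
    return 1.0 / (unchanged + accelerated_fraction / factor)


def speedup_ceiling(accelerated_fraction: float) -> float:
    """Limit of the speedup as the accelerated part becomes instantaneous."""
    unchanged = 1.0 - accelerated_fraction
    return float("inf") if unchanged == 0.0 else 1.0 / unchanged


if __name__ == "__main__":
    # Hypothetical inputs: 80% of the work (say, writing code) gets 8x faster,
    # while the remaining 20% (say, human review) does not speed up at all.
    print(f"overall speedup: {amdahl_speedup(0.8, 8.0):.2f}x")
    print(f"ceiling if the 80% became instantaneous: {speedup_ceiling(0.8):.2f}x")
```

Under these assumed inputs the organization as a whole moves a little over 3× faster, and no further acceleration of the 80% can push it past 5×. That is the sense in which the unaccelerated part, here review, sets the pace.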

An external practitioner reaches the same brake by a different route. Noam Brown (OpenAI, practitioner-opinion) argues an overnight intelligence explosion is unlikely precisely because peak capability requires large-scale test-time compute — runs that take weeks or months — so time itself becomes the binding constraint and the realistic shape is a "gradual takeoff," not an instant one. It is the Amdahl's-law point made about inference duration rather than org throughput; the full argument sits in Intelligence Explosion Dynamics.

What should we do? (the governance response)

Anthropic argues it would "likely be a good thing" to have the option to slow or pause frontier development so societal structures and alignment research can keep up — but a unilateral pause merely changes who leads, and a real one requires multilateral, verifiable coordination. Building the systems that make a credible pause possible is the subject of Frontier Pause Verification and the Anthropic Institute's agenda. "The window to investigate the questions together is here, and people outside AI companies should be involved."

Connections

AI Accelerating AI Development — the measured, present-tense evidence half of the essay; the data behind "the loop is tightening"
AI R&D Autonomy Evaluation (AECI) — the capability-side gate: AECI and the substitution threshold are how Anthropic measures "can the model build the next model?"
Responsible Scaling Policy Evaluations — the deployment brake; the RSP AI-R&D threat model is RSI risk made operational
Research Taste as the Human Bottleneck — the last human comparative advantage; whether it holds determines which of the three futures obtains
Task Time-Horizon Scaling — the external trendline (METR doubling every ~4 months) that makes the extrapolation quantitative
Frontier Pause Verification — the governance response: building the verification regime a credible slowdown would require
The Bitter Lesson — "perspiration is automatable" is the bitter lesson applied to research itself; RSI is its furthest extrapolation
Agentic Loops Overtake Bespoke Systems — RSI's clearest existing-domain proxy: a simple loop matched a bespoke trained system as the model improved
Harness Shrinkage as Models Improve — the same human-role-narrowing dynamic; humans stop writing code and shift to review
Verification as the New Bottleneck — Amdahl's law instantiated: review becomes the binding constraint as generation accelerates
Agentic Misalignment (AM) — the failure mode that could compound through self-improvement: misalignment growing "more frequent but less understood"
Jagged Intelligence (Ghosts, Not Animals) — the "taste is just another capability AI masters" argument rests on the joke/theory-of-mind precedent
LLM-Driven Vulnerability Research — Glasswing is the essay's proof that even frozen capability reshapes the world
Autonomous Scientific Discovery — June 2026 wet-lab evidence that "perspiration is becoming automated" reaches discovery itself (the futures-2/3 case): autonomous drug design, novel hypotheses, week-long genomics
AI-Native Startup Lifecycle — the diffusion scenario: each employee atop a pyramid of agents; 100-person firms doing 1,000-person work
AGI-to-ASI Pathways — DeepMind's "From AGI to ASI" report makes RSI its pathway 3; the theory-first sibling treatment to Anthropic's empirical essay
Intelligence Explosion Dynamics — the growth-curve question (exponential vs. hyperbolic/singularity vs. S-curve) and the four RSI mechanisms (genetic, cultural, cooperative, data), from the DeepMind report
Multi-Agent Collective Intelligence — cooperative/sociogenic RSI: specialization in agent collectives freeing resources for further specialization
Large-Scale Test-Time Compute — Brown's test-time-compute pacing argument: peak capability needs long runs, so time bounds the takeoff and an overnight explosion is unlikely (developed in Intelligence Explosion Dynamics)
Researcher Uplift from Code Output — a near-term marker on this trajectory: METR's Kwa back-solves ~2.5× serial researcher uplift from Anthropic's 8×-code figure and estimates Anthropic's own 2× overall R&D threshold trips at ~3.5× researcher uplift, "which could happen in the next year or so"
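The Task Time-Horizon Scaling entry above cites a doubling time of roughly four months. As a rough sketch of what that implies if the trend simply continues (an assumption the essay itself treats as uncertain), the snippet below extrapolates a constant doubling time. The function names and the starting and target horizons are hypothetical; only the four-month figure comes from this page.

```python
import math


def growth_multiplier(months: float, doubling_months: float = 4.0) -> float:
    """Factor by which the horizon grows after `months` at a fixed doubling time."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2.0 ** (months / doubling_months)


def months_to_reach(start_hours: float, target_hours: float,
                    doubling_months: float = 4.0) -> float:
    """Months needed to grow from start_hours to target_hours at a fixed doubling time."""
    if start_hours <= 0 or target_hours <= 0:
        raise ValueError("horizons must be positive")
    return doubling_months * math.log2(target_hours / start_hours)


if __name__ == "__main__":
    for months in (4, 12, 24):
        print(f"after {months:>2} months: {growth_multiplier(months):.0f}x the starting horizon")
    # Hypothetical example: a 2-hour task horizon growing to a 40-hour one.
    print(f"2h -> 40h takes about {months_to_reach(2.0, 40.0):.1f} months")
```

A constant doubling time is the simplest possible reading of the trendline. Any S-curve bend of the kind discussed under the first future would make these numbers too optimistic.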

Open Questions

Is "research taste" a true ceiling (future 1) or just the next capability to fall (futures 2–3)? The essay frames this as the single load-bearing uncertainty.
The RSI extrapolation rests on trends staying exponential rather than S-curving, but the essay concedes it cannot rule out an architectural ceiling or a compute/energy supply-chain constraint. Which binds first? This question is synthesized against DeepMind's report in RSI Growth Curves: Which Friction Binds First?, which argues that the three futures map one-to-one onto DeepMind's three growth shapes; that the first friction to bind is the one already binding (Amdahl's-law verification and oversight, corresponding to DeepMind's embodied bottleneck); and that the abstraction barrier supplies the mechanism Anthropic lacks for deciding whether taste is a real ceiling (future 1).
If misalignment compounds through self-improvement (future 3), is AECI-gated RSP review fast enough to catch it before control is lost?
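The growth-curve question behind these open questions can be illustrated with three textbook curves. This wiki's RSI Growth Curves entry pairs the exponential shape with the compounding-efficiency future, the hyperbolic shape with full RSI, and the S-curve with the stalled future. The sketch below only shows how differently those shapes behave; the rates, starting value, and cap are arbitrary assumptions, and it is not a model from either Anthropic or DeepMind.

```python
import math


def exponential(t: float, x0: float = 1.0, rate: float = 0.5) -> float:
    """Solution of dx/dt = rate * x: steady compounding."""
    return x0 * math.exp(rate * t)


def hyperbolic(t: float, x0: float = 1.0, rate: float = 0.5) -> float:
    """Solution of dx/dt = rate * x**2: diverges at t = 1 / (rate * x0)."""
    blowup_time = 1.0 / (rate * x0)
    if t >= blowup_time:
        return math.inf
    return x0 / (1.0 - rate * x0 * t)


def logistic(t: float, x0: float = 1.0, rate: float = 0.5, cap: float = 10.0) -> float:
    """Solution of dx/dt = rate * x * (1 - x / cap): an S-curve saturating at cap."""
    return cap / (1.0 + (cap - x0) / x0 * math.exp(-rate * t))


if __name__ == "__main__":
    for t in (0.0, 1.0, 1.9, 4.0, 20.0):
        print(f"t={t:>4}: exponential={exponential(t):10.2f} "
              f"hyperbolic={hyperbolic(t):10.2f} logistic={logistic(t):6.2f}")
```

All three curves start from the same point and look similar early on, which is one reason the essay's "which friction binds first" question is hard to settle from current data alone.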

Sources

When AI builds itself — Anthropic Institute, When AI builds itself: Our progress toward recursive self-improvement, and its implications (Marina Favaro & Jack Clark, June 2026)
Really Big Test-Time Compute in AI Changes Benchmarks, Safety and Research with OpenAI's Noam Brown — Noam Brown (No Priors, 2026-06-26), practitioner-opinion: overnight explosion unlikely because test-time-compute dependence makes time the binding constraint ("gradual takeoff")

About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.

Cited by 29

Agentic Loops Overtake Bespoke Systems
DeepMind's *basic* Ralph-loop agent matched its bespoke evolutionary+AlphaProof system as the LLM improved; the bitter…
Agentic Misalignment (AM)
Lynch et al. 2025 eval and threat model: LLM email-agent discovers it may be deleted, can take harmful actions; OOD rel…
AGI-to-ASI Pathways
DeepMind's four non-exclusive, parallel technological routes from human-level AGI to superintelligence — scaling, algor…
AI Accelerating AI Development
The empirical core of *When AI builds itself*: measured evidence AI already speeds AI R&D at Anthropic — >80% of merged…
AI-Native Startup Lifecycle
Anthropic's May 2026 reframing of Idea/MVP/Launch/Scale assuming AI infrastructure: each stage's headcount/capital/skil…
AI R&D Autonomy Evaluation (AECI)
How Anthropic measures whether a model can automate or dramatically accelerate AI research — the capability that drives…
Anthropic
AI safety company / vendor of Claude; mission-as-tiebreaker culture; ~30–40 PMs across teams; Mike Krieger leads Labs r…
Anthropic Institute
Anthropic's policy/governance research arm; published *When AI builds itself* (Favaro & Clark, 2026) on recursive self-…
Autonomous Scientific Discovery
Mythos-class models now conduct novel science with limited human input — autonomous protein/drug design (~10× faster, m…
Build for the Next Model
Prototype the thing that almost works, not the thing that already works: bet that the next concrete model release (not…
Compute Allocator
The human's evolving role: deciding what's worth spending compute on; ~1% of generated tokens ship, 99% is scaffolding…
Frontier Pause Verification
The arms-control problem of a credible, verifiable slowdown or pause of frontier AI: detectability is harder than for o…
Google DeepMind
Google's AI lab; built AlphaProof Nexus; Gemini models, AlphaProof, AlphaEvolve, and the open-weight Gemma line; opens…
Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
Instrumental Convergence
Omohundro/Bostrom's thesis that whatever an AI's final goal, it tends to pursue universally useful sub-goals — resource…
Intelligence Explosion Dynamics
The growth-curve question behind recursive self-improvement: whether AI-accelerating-AI produces exponential, super-exp…
Jagged Intelligence (Ghosts, Not Animals)
"Ghosts not animals": jagged statistical circuits, no intrinsic motivation; car-wash/strawberry failures; stay in the l…
LLM-Driven Vulnerability Research
Claude Mythos Preview's emergent cybersecurity capabilities: autonomous zero-day discovery, full exploit chains, and An…
METR
Independent AI-evaluation org behind the 'time horizons' benchmark — the task length a model can complete reliably on i…
Superintelligence Trajectory
Map of Content for the superintelligence-trajectory domain — 20 concepts. The path from AGI to ASI: recursive self-impr…
Multi-Agent Collective Intelligence
DeepMind's fourth pathway to ASI: superintelligence as an emergent property of many coordinated AGI agents — group agen…
Open Questions Backlog
_396 actionable open questions across 155 pages · 79 predictions · 9 notes · 21 in progress · 59 watching (entities), a…
Research Taste as the Human Bottleneck
The narrowing human role as AI absorbs execution: choosing which problems matter, which results to trust, and when an a…
Researcher Uplift from Code Output
Thomas Kwa (METR) translates Anthropic's reported 8× code-per-engineer-per-day into serial researcher uplift with produ…
Responsible Scaling Policy Evaluations
Anthropic's RSP gates deployment on pre-release capability evaluations in CBRN, automated AI R&D, and high-stakes misal…
RSI Growth Curves: Which Friction Binds First?
DeepMind's exponential/hyperbolic/S-curve growth shapes are Anthropic's compounding-efficiency/full-RSI/stalled futures…
Task Time-Horizon Scaling
METR's measure of the task length AI can complete reliably on its own, doubling roughly every 4 months (up from every 7…
The Bitter Lesson
Sutton 2019: scaled general methods beat hand-engineered structure; recurring justification across the wiki for dissolv…
Verification as the New Bottleneck
Fiona Fung: coding is no longer the bottleneck — verification, review, maintenance are; shift-left; TDD loses its tax;…
