Summary#
Recursive self-improvement (RSI) is the point at which an AI system can fully autonomously design and develop its own successor, closing the loop so that each model is improved by the previous model rather than by humans. The Anthropic Institute essay When AI builds itself (Marina Favaro & Jack Clark, June 2026) is this wiki's primary source. Its argument has two halves: (1) a present-tense empirical claim that AI is already accelerating the development of AI (see AI Accelerating AI Development; for example, Anthropic engineers ship ~8× more code per quarter than in 2021–2025), and (2) an extrapolation that the trend "points to an AI system capable of fully autonomously designing and developing its own successor." Anthropic's stated position: "We are not there yet, and recursive self-improvement is not inevitable. But it could come sooner than most institutions are prepared for."
This page is the hub for the RSI cluster — the trajectory, the futures, and the governance response. The measured evidence lives in AI Accelerating AI Development; the capability-gating eval in AI R&D Autonomy Evaluation (AECI); the deployment brake in Responsible Scaling Policy Evaluations; and the coordination problem in Frontier Pause Verification.
Closing the loop#
The essay frames RSI as the endpoint of a steadily tightening development loop, illustrated as person → computer → chatbot → agent → workers, with each stage delegating more of the work to AI:
- 2021–2023 — Building the first Claude. Humans write code and docs on laptops; AI is absent from the loop.
- 2023–2025 — Chatbots. People paste model-generated snippets into editors.
- 2025–2026 — Coding agents. Agents write and edit whole files on their own (Claude Code launches Feb 2025).
- Today — Autonomous agents. Agents run their own code and delegate hours of work to other agents (the loop primitive running unattended).
- 20XX? — Closing the loop. "Agents could become capable enough to build and train models themselves. If this happens, future versions of Claude could be continuously improved by Claude itself." This last step is RSI.
"What if we're wrong?" — why direction-setting may not save us#
The natural objection: the work still in human hands, namely choosing which problems to work on (Research Taste as the Human Bottleneck), is what matters most, so AI remains a capable assistant, not an autonomous driver of progress. The essay's response, as summarized here, has three parts:
- Perspiration is becoming automated. AI advances rarely come from "eureka" moments; paradigm shifts (the Transformer, mixture-of-experts) "arrive years apart." In between, "most progress is incremental: we scale something up, see what breaks, fix it, and try again" — exactly the workflow Claude now excels at. Edison's "1% inspiration, 99% perspiration" is invoked: "we see perspiration becoming increasingly automated." Large-scale research progress "is mostly a function of tools and resources" — how fast and how many experiments you can run — which is the bitter lesson pushed to its limit.
- A conservative reading still compounds. Even if Claude never gets research taste, if humans spend most of their time on the single-digit fraction of work that is direction-setting while Claude handles the rest, each human steers far more work than before. "AI already makes Anthropic move much faster than it did before."
- The less-conservative reading. The early evidence of improving research judgment (51%→64% on next-step decisions; see AI Accelerating AI Development) suggests taste "might be just another AI capability that AI systems fail at for a time, then get good at" — the same pattern seen with explaining why a joke is funny, theory of mind, and linguistic riddles (Jagged Intelligence (Ghosts, Not Animals)).
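The "conservative reading" above is at bottom an arithmetic claim, and a minimal sketch can make it concrete. The function name `leverage` and the example fractions are illustrative assumptions, not figures from the essay. The sketch also assumes that human time is the only limit and that AI fully absorbs everything other than direction-setting.

```python
def leverage(direction_fraction: float) -> float:
    """Multiple of total work one human can steer if they spend all their
    time on the direction-setting share and AI handles the rest.

    Illustrative assumption: work scales as 1 / direction_fraction.
    """
    if not 0.0 < direction_fraction <= 1.0:
        raise ValueError("direction_fraction must be in (0, 1]")
    return 1.0 / direction_fraction


if __name__ == "__main__":
    # Hypothetical single-digit shares of work that are direction-setting.
    for share in (0.09, 0.05, 0.01):
        print(f"direction-setting share {share:.0%}: "
              f"~{leverage(share):.0f}x work steered per human")
```

Under these assumptions a 5% direction-setting share corresponds to roughly a 20× multiple. That is why even a model with no research taste could still compound human output.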
Three possible futures#
The essay lays out three scenarios for "what happens next," contingent on whether the trend continues and what we choose to do:
- The trend stalls (S-curve), but today's capabilities diffuse widely. Exponentials bend; the judgment separating a competent researcher from a great one may not come from scaling compute/data, requiring a new architecture past the Transformer — or the binding constraint may be the supply chain (energy, chip fab, grid, interconnect) rather than intelligence. Even frozen at today's capability, the world changes: Project Glasswing already shifted the cyber bottleneck from finding to patching (LLM-Driven Vulnerability Research), and a 100-person company can increasingly do the work of a 1,000-person one (AI-Native Startup Lifecycle). Anthropic thinks this is unlikely — "we have not yet seen that curve bend."
- Compounding efficiency gains; humans still set direction. AI development becomes substantially automated, but humans still judge the results. 100-person companies do the work of 10,000- to 100,000-person ones. This could revolutionize knowledge work and government, but it could also power authoritarian surveillance or individualized influence operations at superhuman scale. The essay says the evidence suggests this is the likely path, bounded by Amdahl's law (below).
- Full RSI — AI builds its successors. Pace becomes determined entirely by compute (and algorithmic-efficiency discoveries). Humans move "most of our effort towards oversight, validation, and verification of an expanding 'virtual lab' run by AI systems," with skills transferring to the rest of science. How the alignment problem resolves here is what Anthropic is "least certain about": models may be aligned and wise enough to find novel solutions (or to halt), or "the rare occurrences of misalignment present in today's models could compound as the models build their successors, growing more frequent but less understood until we lose control."
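The difference between future 1 and futures 2–3 turns on whether the trend is a true exponential or the early part of an S-curve. The sketch below is illustrative only and is not drawn from the essay. It compares an exponential with a logistic curve that starts at the same value, to show why the two are hard to tell apart before the bend. The `ceiling` and `rate` values are hypothetical parameters.

```python
import math


def exponential(t: float, rate: float = 1.0) -> float:
    """Unbounded exponential growth, normalized to 1 at t = 0."""
    return math.exp(rate * t)


def logistic(t: float, ceiling: float = 1000.0, rate: float = 1.0) -> float:
    """S-curve normalized to 1 at t = 0 that saturates at `ceiling`."""
    return ceiling / (1.0 + (ceiling - 1.0) * math.exp(-rate * t))


def relative_gap(t: float) -> float:
    """Fraction by which the S-curve falls short of the exponential at time t."""
    e = exponential(t)
    return (e - logistic(t)) / e


if __name__ == "__main__":
    for t in (1, 2, 4, 6, 8, 10):
        print(f"t={t:>2}: exponential={exponential(t):10.1f} "
              f"logistic={logistic(t):8.1f} gap={relative_gap(t):6.1%}")
```

With these assumed parameters the curves differ by under 1% at t = 2 and by more than 90% at t = 10. Early data that looks exponential therefore cannot by itself rule out a ceiling.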
Amdahl's law for organizations#
A recurring brake across futures 2–3: speeding up one part of a process just shifts the bottleneck elsewhere, so the overall pace is capped by the parts that haven't sped up (Amdahl's law). Anthropic has already run into this: as more code flows through the org, human code review has become the new bottleneck, the org-level instance of Verification as the New Bottleneck. The same friction appears beyond engineering, as an explosion of ideas, initiatives, and tools, "far more than we have the capacity to pursue." Spotting and clearing these bottlenecks "may become the most important skill for any organization." This is also why "the felt pace of this future will still be set by the bottlenecks": RSI can't run clinical trials faster than biology, hold elections sooner than constitutions allow, or turn a stranger into an old friend in a weekend.
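Amdahl's law has a standard closed form: if a fraction p of a process is sped up by a factor s, the overall speedup is 1 / ((1 − p) + p / s). The sketch below applies that formula to an organization. The 80% and 10× figures are hypothetical inputs chosen for illustration, not numbers from the essay.

```python
import math


def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when `accelerated_fraction` of the work runs
    `factor` times faster and the remainder is unchanged (Amdahl's law)."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be in [0, 1]")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / factor)


if __name__ == "__main__":
    # Hypothetical org: 80% of the work (e.g. writing code) is accelerated,
    # 20% (e.g. human review) is not.
    for factor in (2.0, 10.0, 100.0, math.inf):
        print(f"factor={factor:>6}: overall speedup "
              f"{amdahl_speedup(0.8, factor):.2f}x")
```

In this example the overall gain never exceeds about 5×, however fast the accelerated 80% becomes, because the untouched 20% sets the ceiling. That is the sense in which review becomes the binding constraint.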
What should we do? (the governance response)#
Anthropic argues it would "likely be a good thing" to have the option to slow or pause frontier development so societal structures and alignment research can keep up — but a unilateral pause merely changes who leads, and a real one requires multilateral, verifiable coordination. Building the systems that make a credible pause possible is the subject of Frontier Pause Verification and the Anthropic Institute's agenda. "The window to investigate the questions together is here, and people outside AI companies should be involved."
Connections#
- AI Accelerating AI Development — the measured, present-tense evidence half of the essay; the data behind "the loop is tightening"
- AI R&D Autonomy Evaluation (AECI) — the capability-side gate: AECI and the substitution threshold are how Anthropic measures "can the model build the next model?"
- Responsible Scaling Policy Evaluations — the deployment brake; the RSP AI-R&D threat model is RSI risk made operational
- Research Taste as the Human Bottleneck — the last human comparative advantage; whether it holds determines which of the three futures obtains
- Task Time-Horizon Scaling — the external trendline (METR doubling every ~4 months) that makes the extrapolation quantitative
- Frontier Pause Verification — the governance response: building the verification regime a credible slowdown would require
- The Bitter Lesson — "perspiration is automatable" is the bitter lesson applied to research itself; RSI is its furthest extrapolation
- Agentic Loops Overtake Bespoke Systems — RSI's clearest existing-domain proxy: a simple loop matched a bespoke trained system as the model improved
- Harness Shrinkage as Models Improve — the same human-role-narrowing dynamic; humans stop writing code and shift to review
- Verification as the New Bottleneck — Amdahl's law instantiated: review becomes the binding constraint as generation accelerates
- Agentic Misalignment (AM) — the failure mode that could compound through self-improvement: misalignment growing "more frequent but less understood"
- Jagged Intelligence (Ghosts, Not Animals) — the "taste is just another capability AI masters" argument rests on the joke/theory-of-mind precedent
- LLM-Driven Vulnerability Research — Glasswing is the essay's proof that even frozen capability reshapes the world
- Autonomous Scientific Discovery — June 2026 wet-lab evidence that "perspiration is becoming automated" reaches discovery itself (the futures-2/3 case): autonomous drug design, novel hypotheses, week-long genomics
- AI-Native Startup Lifecycle — the diffusion scenario: each employee atop a pyramid of agents; 100-person firms doing 1,000-person work
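The Task Time-Horizon Scaling entry above cites a doubling time of roughly 4 months. As a rough sketch of what that implies if the doubling time stayed constant (an assumption, and the one the S-curve scenario disputes), the function below extrapolates a task horizon forward. The starting horizon of 1 hour is a hypothetical placeholder, not a measured value.

```python
def horizon_after(months: float,
                  start_hours: float = 1.0,
                  doubling_months: float = 4.0) -> float:
    """Task horizon after `months`, assuming a constant doubling time.

    `start_hours` is a hypothetical starting point. `doubling_months`
    defaults to the ~4-month figure cited on this page.
    """
    if doubling_months <= 0.0:
        raise ValueError("doubling_months must be positive")
    return start_hours * 2.0 ** (months / doubling_months)


if __name__ == "__main__":
    for months in (4, 12, 24, 36):
        print(f"after {months:>2} months: "
              f"{horizon_after(months):.0f}x the starting horizon")
```

Under constant 4-month doubling, the horizon grows 8× in one year and 64× in two.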
Open questions#
- Is "research taste" a true ceiling (future 1) or just the next capability to fall (futures 2–3)? The essay frames this as the single load-bearing uncertainty.
- The RSI extrapolation rests on trends staying exponential rather than S-curving — but the essay concedes it cannot rule out an architectural ceiling or a compute/energy supply-chain constraint. Which binds first?
- If misalignment compounds through self-improvement (future 3), is AECI-gated RSP review fast enough to catch it before control is lost?
Sources#
- When AI builds itself — Anthropic Institute, When AI builds itself: Our progress toward recursive self-improvement, and its implications (Marina Favaro & Jack Clark, June 2026)
