
Anthropic Institute

Published: June 7, 2026 · Filed: Entity · Domain: Entities · Tags: Entity, Org, AI Policy, Governance, Anthropic · Reading: 3 min · Source: AI-synthesised

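Anthropic's policy and governance research arm; published *When AI builds itself* (Favaro & Clark, 2026) on Recursive Self-Improvement; its agenda includes building the verification systems needed for a credible multilateral AI slowdown.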



Summary

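Anthropic Institute is Anthropic's research and policy arm, focused on the societal and governance implications of frontier AI. It published When AI builds itself (June 2026), which is this wiki's primary source on Recursive Self-Improvement, and it maintains a public agenda to work with others on building the systems needed for a credible AI slowdown or pause (Frontier Pause Verification).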

Key work

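  • Public-facing trajectory analysis. When AI builds itself combines public benchmarks (Task Time-Horizon Scaling) with previously unpublished internal Anthropic data (AI Accelerating AI Development) to argue that AI is already accelerating AI R&D, and it sketches three possible futures for RSI.
  • Coordination infrastructure. The Institute says it plans to work with many others on research and action to help build the systems that a credible slowdown or pause would require: verifying that other developers have actually stopped, and ensuring that bad actors cannot exploit a coordinated slowdown to pull ahead covertly (Frontier Pause Verification).
  • Convening. In the months after the article's publication, the Institute plans to organize conversations among policymakers, researchers, civil society, and other AI companies, and to publish the results, explicitly inviting voices from outside AI companies into the discussion.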

People

  • Marina Favaro and Jack Clark co-authored When AI builds itself (editorial support from Santi Ruiz; visual design by Shan Carter, Romello Goodman, and Nikki Makagiansar, with data from Brian Calvert and Jun Shern Chan).

Open questions

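  • How does the Institute's policy posture (leaning toward preserving the option to pause) interact with Anthropic's commercial incentive to ship frontier models? The article acknowledges competitive and geopolitical pressures but does not resolve the question.
  • Which specific verification mechanisms will the Institute prototype, and what is its timeline relative to the RSI trends it warns about?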

Sources

  • When AI builds itself: Anthropic Institute, When AI builds itself (Marina Favaro & Jack Clark, June 2026)
About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.
