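Summary#
Blast radius measures the potential damage an agent can cause when it malfunctions or goes out of control. An agent with read-only access to a single database has a small blast radius; an agent with administrative access to cloud infrastructure has a very large one. In Zero Trust for AI Agents, it is the core unit of the "assume breach" principle: security investment should be proportional to exposure, and a design-for-breach posture means assuming that every agent's blast radius will eventually be tested.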
Why it's the right unit#
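Zero Trust does not promise to prevent compromise; it promises to contain it. Blast radius reframes the security question from "Can we keep attackers out?" (a perimeter-defense game that is bound to be lost under AI-Accelerated Offense) to "When an agent is compromised, how much can it reach?" Every other control in the framework (Least Agency, identity, isolation) ultimately exists to shrink that number.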
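One way to make "how much can it reach" concrete is to treat an agent's blast radius as the set of resources and actions its grants expose. The sketch below is a minimal illustration, not part of the framework: the `Grant` type, the `blast_radius` function, and the resource names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Set


@dataclass(frozen=True)
class Grant:
    """A hypothetical permission record: which actions an agent may take on a resource."""
    resource: str
    actions: FrozenSet[str]


def blast_radius(grants: Iterable[Grant]) -> Dict[str, Set[str]]:
    """Return everything an attacker inherits if the agent holding these grants is compromised."""
    reach: Dict[str, Set[str]] = {}
    for grant in grants:
        reach.setdefault(grant.resource, set()).update(grant.actions)
    return reach


# Illustrative agents only; the names are assumptions, not taken from the framework.
reporting_agent = [Grant("orders-db", frozenset({"read"}))]
infra_agent = [
    Grant("cloud-account", frozenset({"read", "write", "delete", "admin"})),
    Grant("orders-db", frozenset({"read", "write"})),
]

small = blast_radius(reporting_agent)
large = blast_radius(infra_agent)
```

Here `small` covers one read-only resource, while `large` spans two resources including administrative actions, mirroring the two agents described in the summary.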
Containment mechanisms (resource boundaries)#
The framework's primary blast-radius control is identity-based isolation, not network segmentation:
- Identity-based isolation (Foundation): every agent workload carries its own cryptographic identity, and each service accepts connections only from explicitly named callers. Network segmentation is only a backstop, not the primary boundary: if a service accepts any caller on its network, an attacker who reaches the segment can move laterally from there. "Enforce isolation at the receiving end."
- Sandboxed execution (Enterprise): containers with restricted privileges, runtimes such as gVisor for syscall filtering, and restricted mounts/network. For any agent that processes untrusted input (web content, documents), this is treated as mandatory, not aspirational.
- Hardware isolation (Advanced): AMD SEV / Intel TDX, microVMs, attestation; even the host OS cannot inspect or tamper with the workload.
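Receiver-side enforcement of identity-based isolation can be sketched as an allowlist check. This is a minimal illustration under stated assumptions: the caller's identity has already been cryptographically verified upstream (for example during an mTLS handshake), and the service names, identity strings, and `accept_connection` helper are hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical allowlist: each service names exactly which caller identities it accepts.
ALLOWED_CALLERS: Dict[str, Set[str]] = {
    "orders-db-proxy": {"agent/reporting"},
    "ticketing-api": {"agent/triage", "agent/reporting"},
}


def accept_connection(service: str, verified_caller: Optional[str]) -> bool:
    """Accept only explicitly named callers; network location plays no part in the decision.

    `verified_caller` is assumed to be an identity that was already authenticated
    upstream; None means the caller presented no valid identity.
    """
    if verified_caller is None:
        return False
    # Unknown services get an empty allowlist, so the default is deny.
    return verified_caller in ALLOWED_CALLERS.get(service, set())
```

Under these assumptions, a caller that sits in the same network segment as `orders-db-proxy` but holds only the `agent/triage` identity is refused, which is why segmentation can be treated as a backstop rather than the boundary.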
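Supporting credential-side containment (Agent Identity and Authentication): per-agent credentials and credential isolation mean that a single stolen secret exposes only one agent's access, rather than the combined access of every agent that would otherwise share that secret.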
Compartmentalization as deliberate design#
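Phase 3 of the workflow makes blast-radius assessment an explicit step: after defining approved actions, prohibited actions, escalation triggers, and scope limits, identify what could go wrong if the agent is compromised. The framework recommends splitting a single agent's functions across multiple agents with distinct identities, so that an attacker has to compromise more agents to reach more resources. This only works if every agent gets unique credentials (shared credentials defeat compartmentalization).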
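The caveat about shared credentials can be illustrated by grouping reachable resources by credential rather than by agent. This is a minimal sketch with hypothetical agent, credential, and resource names:

```python
from typing import Dict, Set, Tuple

# agent name -> (credential id, resources that agent can reach); all names are made up.
AgentMap = Dict[str, Tuple[str, Set[str]]]


def reach_per_credential(agents: AgentMap) -> Dict[str, Set[str]]:
    """Return what one stolen credential exposes: the union over every agent using it."""
    reach: Dict[str, Set[str]] = {}
    for credential, resources in agents.values():
        reach.setdefault(credential, set()).update(resources)
    return reach


unique_creds: AgentMap = {
    "fetcher": ("cred-a", {"web"}),
    "summarizer": ("cred-b", {"doc-store"}),
    "publisher": ("cred-c", {"cms"}),
}
# Same three agents, but all issued one shared credential.
shared_cred: AgentMap = {
    name: ("cred-shared", resources) for name, (_, resources) in unique_creds.items()
}

worst_unique = max(len(r) for r in reach_per_credential(unique_creds).values())
worst_shared = max(len(r) for r in reach_per_credential(shared_cred).values())
```

With unique credentials the worst single theft reaches one resource; with a shared credential the same three agents collapse into one compartment that reaches all three.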
The impossible-vs-tedious link#
Blast-radius assessment must pass the Impossible, Not Tedious (Design Test): "If your containment plan relies on friction (for example, the attacker has to send many requests or get around several rate limits), assume it will fail." A blast radius that is merely inconvenient to traverse is not contained; if the residual risk is unacceptable, tighten controls until traversal is impossible, not just tedious.
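The design test can be applied mechanically to the controls that sit on a path between a compromised agent and a resource. The sketch below assumes a reviewer has already labeled each control as either blocking traversal outright or merely adding friction; the `Control` type, the labels, and the example controls are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Control:
    """A hypothetical control on the path from a compromised agent to a resource."""
    name: str
    blocks_outright: bool  # True: traversal is impossible; False: it is only slowed down


def passes_design_test(path_controls: Iterable[Control]) -> bool:
    """A path counts as contained only if at least one control makes traversal impossible."""
    return any(control.blocks_outright for control in path_controls)


friction_only = [Control("rate limit", False), Control("request quota", False)]
hard_boundary = friction_only + [Control("receiver rejects unnamed callers", True)]
```

In this sketch `friction_only` fails the test no matter how many friction controls are stacked, while `hard_boundary` passes because one control removes the path entirely.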
Related links#
- Zero Trust for AI Agents: blast radius is the unit that the "assume breach" principle contains (hub)
- Least Agency: the input-side control; constraining agency is how blast radius is reduced
- Agent Identity and Authentication: identity-based isolation and per-agent credentials are the primary blast-radius controls
- Impossible, Not Tedious (Design Test): the test a containment plan must pass (traversal must be impossible, not merely tedious)
- Claude Code Best Practices: sandboxed execution + write-access restrictions as a reference containment implementation
- Autonomous Defense: the same blast-radius containment applied inward to defensive (Agentic SOAR) agents, which are themselves high-value targets
- Foundation → Enterprise → Advanced: Is the Agent Access-Control Jump a Cliff? Covers the staged migration (identity first, then agency, then containment) and where identity-based isolation → sandboxing → hardware isolation sit on the tier ladder
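Open questions#
- The framework prefers identity-based isolation over network segmentation, but most enterprises have already invested heavily in segmentation. What is the migration path, and does dual-running introduce new vulnerabilities?
- Multi-agent compartmentalization increases the number of identities to manage; at what point does identity-management overhead become an attack surface of its own?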
Sources#
- Zero Trust for AI Agents: blast radius is defined in Part I, resource boundaries in Part III, and the Phase 3 blast-radius assessment in Part IV
