Glossary

Breakout Attack (Agent)

Reviewed 9 April 2026 Canonical definition

A breakout attack occurs when an AI agent escapes the boundaries of its intended execution environment — accessing systems, data, or capabilities outside its defined scope. In sandboxed deployments, breakout exploits vulnerabilities in the sandbox itself. In policy-governed systems, it exploits gaps or ambiguities in the policy rules. Breakout is the primary threat that runtime containment strategies are designed to prevent.

Breakout Attack (Agent)

Related articles

Related terms