
Threat Modelling (AI Agent)

Reviewed 9 April 2026 · Canonical definition

Threat modelling for AI agents is the structured analysis of how an agent system could be attacked — identifying the assets worth protecting, the potential attackers and their capabilities, the attack vectors available to them, and the controls that mitigate each threat. A threat model for an agent typically covers prompt injection, tool misuse, identity spoofing, data exfiltration, and supply chain compromise, and drives the security requirements for agent governance.
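To make the structure concrete, below is a minimal sketch of how such a threat model could be recorded in code. It is illustrative only: the names (`ThreatCategory`, `Threat`, `AgentThreatModel`, `support-agent`) and the example entries are assumptions, not part of any standard or specific product.

```python
from dataclasses import dataclass, field
from enum import Enum


class ThreatCategory(Enum):
    """Threat categories typically covered in an agent threat model."""
    PROMPT_INJECTION = "prompt injection"
    TOOL_MISUSE = "tool misuse"
    IDENTITY_SPOOFING = "identity spoofing"
    DATA_EXFILTRATION = "data exfiltration"
    SUPPLY_CHAIN_COMPROMISE = "supply chain compromise"


@dataclass
class Threat:
    """One threat: the asset at risk, the attacker, the vector, and the controls."""
    category: ThreatCategory
    asset: str                  # what is worth protecting
    attacker: str               # who could mount the attack, and with what access
    attack_vector: str          # how the attack reaches the agent
    controls: list[str] = field(default_factory=list)  # mitigations in place

    def is_mitigated(self) -> bool:
        return bool(self.controls)


@dataclass
class AgentThreatModel:
    """All identified threats for one agent; open items become security requirements."""
    agent_name: str
    threats: list[Threat] = field(default_factory=list)

    def unmitigated(self) -> list[Threat]:
        """Threats with no controls yet, i.e. outstanding requirements."""
        return [t for t in self.threats if not t.is_mitigated()]


if __name__ == "__main__":
    # Hypothetical example: a customer-support agent with tool access to a CRM.
    model = AgentThreatModel(
        agent_name="support-agent",
        threats=[
            Threat(
                category=ThreatCategory.PROMPT_INJECTION,
                asset="customer records reachable via the CRM tool",
                attacker="anyone who can place text in a ticket the agent reads",
                attack_vector="malicious instructions embedded in ticket content",
                controls=["input sanitisation", "tool allow-list", "human approval for writes"],
            ),
            Threat(
                category=ThreatCategory.DATA_EXFILTRATION,
                asset="internal knowledge base documents",
                attacker="compromised downstream tool or plugin",
                attack_vector="agent output channels such as email or web requests",
                controls=[],  # no mitigation yet, so it surfaces as an open requirement
            ),
        ],
    )
    for threat in model.unmitigated():
        print(f"Open requirement: mitigate {threat.category.value} against {threat.asset}")
```

Listing the unmitigated threats is one simple way such a model can drive security requirements: every threat without a control becomes work for the governance process to assign.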