AI Agent Security Checklist
12 controls to verify before deploying AI agents to production.
AI agents introduce attack surfaces that traditional application security does not cover — prompt injection, tool poisoning, credential misuse, and data exfiltration through autonomous tool calls. This checklist covers the essential security controls every team should verify before an agent reaches production.
Every agent has a unique, scoped identity
No shared service accounts. Each agent is registered with its own credentials, bound to specific permissions, and attributable in audit logs.
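A minimal sketch of per-agent registration, assuming an in-memory registry as a stand-in for a real identity provider; the `register_agent` helper and `AGENT_REGISTRY` names are illustrative, not a specific product API.

```python
import uuid

AGENT_REGISTRY: dict[str, dict] = {}  # stand-in for an identity provider

def register_agent(name: str, permissions: set[str]) -> str:
    """Give each agent its own identity so every action in the audit
    log is attributable to exactly one agent, never a shared account."""
    agent_id = f"{name}-{uuid.uuid4().hex[:8]}"  # unique, human-readable ID
    AGENT_REGISTRY[agent_id] = {"name": name, "permissions": set(permissions)}
    return agent_id
```

Because each agent gets a distinct ID at registration time, log entries keyed on `agent_id` stay attributable even when many agents run the same code.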
Credentials are short-lived and automatically rotated
Agents use time-limited tokens rather than static API keys. Rotation is automated and credential lifecycle events are logged.
Least privilege is enforced per agent
Each agent can only access the tools, data, and APIs it needs for its specific task. Broad permissions are replaced with granular, scoped access.
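One way to enforce this is a deny-by-default allow-list checked before every tool call; the agent and tool names below are purely illustrative.

```python
# Hypothetical per-agent scopes: each agent lists only the tools it needs.
AGENT_SCOPES: dict[str, set[str]] = {
    "support-agent": {"search_kb", "create_ticket"},
    "billing-agent": {"read_invoice"},
}

def authorize_tool_call(agent_id: str, tool: str) -> bool:
    """Deny by default: allow a call only if the tool is in the agent's
    explicit scope. Unknown agents get no access at all."""
    return tool in AGENT_SCOPES.get(agent_id, set())
```

The important property is the empty-set default: an unregistered or misconfigured agent can invoke nothing, rather than everything.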
Input validation is applied to all agent inputs
User messages and retrieved content are screened for injection attempts, malicious payloads, and malformed data before reaching the model.
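A minimal pattern-based screen, assuming a handful of known injection phrasings; production systems would layer this with a trained classifier rather than rely on regexes alone.

```python
import re

# Illustrative markers of common prompt-injection attempts.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"system prompt", re.I),
]

def screen_input(text: str) -> bool:
    """Return True if the input looks safe to forward to the model."""
    return not any(p.search(text) for p in INJECTION_PATTERNS)
```

Inputs that fail the screen can be rejected, sanitized, or routed to human review depending on risk tolerance.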
Indirect prompt injection defenses are in place
Content from external sources — documents, emails, web pages, database results — is treated as untrusted and inspected before being included in the agent's context.
Output filtering catches sensitive data leakage
Agent responses are scanned for PII, credentials, internal system details, and other sensitive data before being returned to users or passed to downstream systems.
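A small redaction pass over outbound text, assuming a few illustrative patterns; real deployments typically use dedicated DLP tooling with far broader coverage.

```python
import re

# Illustrative sensitive-data patterns; a real DLP layer has many more.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "aws_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact_output(text: str) -> str:
    """Replace detected sensitive values before the response leaves the agent."""
    for label, pattern in SENSITIVE_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text
```

Running the same filter on payloads passed to downstream systems, not just user-facing replies, closes the common gap where leakage happens agent-to-agent.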
Content safety guardrails are active
Guardrails check for harmful, misleading, or non-compliant content in agent outputs. Violations are blocked, logged, and trigger alerts.
Every tool is reviewed and approved before use
MCP servers, API integrations, and custom tools go through a security review that evaluates data access, side effects, and failure modes before agents can invoke them.
Tool calls are logged with full parameters and responses
Every tool invocation is recorded — including what was called, with what arguments, and what was returned. This enables forensic investigation and compliance evidence.
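One common implementation is a decorator that wraps every tool function; the `AUDIT_LOG` list here stands in for an append-only audit store, and the weather tool is a stub for illustration.

```python
import functools
import json
import time

AUDIT_LOG: list[dict] = []  # stand-in for an append-only audit store

def audited(tool_name: str):
    """Decorator recording every invocation: tool, arguments, result or error."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            record = {
                "ts": time.time(),
                "tool": tool_name,
                "args": json.dumps({"args": args, "kwargs": kwargs}, default=str),
            }
            try:
                result = fn(*args, **kwargs)
                record["result"] = json.dumps(result, default=str)
                return result
            except Exception as exc:
                record["error"] = repr(exc)  # failures are evidence too
                raise
            finally:
                AUDIT_LOG.append(record)  # logged even when the tool raises
        return wrapper
    return decorator

@audited("get_weather")
def get_weather(city: str) -> dict:
    """Stub tool used only to demonstrate the audit wrapper."""
    return {"city": city, "temp_c": 21}
```

Because the record is appended in a `finally` block, failed calls are captured too, which is exactly what forensic investigation needs.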
Anomaly detection monitors agent behavior
Baselines are established for normal agent behavior — tool call patterns, token usage, error rates — and deviations trigger automated alerts.
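A toy version of baseline-and-deviate, assuming a recorded baseline of tool calls per minute and a simple three-sigma threshold; real systems track many signals and use more robust statistics.

```python
import statistics

# Hypothetical baseline: tool calls per minute under normal operation.
BASELINE_CALLS_PER_MIN = [4, 5, 3, 6, 4, 5, 4, 5, 6, 4]

def is_anomalous(observed: float, threshold_sigma: float = 3.0) -> bool:
    """Flag a reading more than threshold_sigma standard deviations
    from the baseline mean."""
    mean = statistics.mean(BASELINE_CALLS_PER_MIN)
    stdev = statistics.stdev(BASELINE_CALLS_PER_MIN)
    return abs(observed - mean) > threshold_sigma * stdev
```

An agent suddenly making 40 tool calls a minute against a baseline near 5 trips the alert; normal fluctuation does not.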
Kill switches are accessible and tested
Security teams can immediately suspend any agent. Kill switches are documented, accessible, and regularly tested to confirm they work under pressure.
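The core mechanism is a suspension flag checked before every action; the shared set below stands in for a control-plane flag that all agent runtimes consult, and the function names are illustrative.

```python
SUSPENDED_AGENTS: set[str] = set()  # stand-in for a shared control-plane flag

def kill_switch(agent_id: str) -> None:
    """Immediately suspend an agent; takes effect on its very next action."""
    SUSPENDED_AGENTS.add(agent_id)

def pre_action_gate(agent_id: str) -> None:
    """Called before every model call and tool call; raises if suspended."""
    if agent_id in SUSPENDED_AGENTS:
        raise RuntimeError(f"agent {agent_id} is suspended")
```

Regularly exercising `kill_switch` in drills is what turns it from a documented control into one that actually works under pressure.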
An incident response playbook exists for agent compromises
The team has documented steps for detecting, containing, investigating, and recovering from agent security incidents — including communication plans and post-incident review.
See how Prefactor enforces agent security controls
Prefactor gives enterprises runtime governance, observability, and control over every AI agent in production.
Book a demo →