
AI Agent Security Checklist

12 controls to verify before deploying AI agents to production.

Updated 20 March 2026 · 12 items · 5 categories

AI agents introduce attack surfaces that traditional application security does not cover — prompt injection, tool poisoning, credential misuse, and data exfiltration through autonomous tool calls. This checklist covers the essential security controls every team should verify before an agent reaches production.

Identity & Access

Every agent has a unique, scoped identity

No shared service accounts. Each agent is registered with its own credentials, bound to specific permissions, and attributable in audit logs.

Credentials are short-lived and automatically rotated

Agents use time-limited tokens rather than static API keys. Rotation is automated and credential lifecycle events are logged.
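A minimal sketch of what a short-lived, agent-bound token can look like. The names (`issue_token`, `verify_token`, `SIGNING_KEY`) and the 15-minute default TTL are illustrative assumptions, not a prescribed implementation; in production the signing key would live in a secrets manager and issuance would be handled by your identity provider.

```python
import base64
import hashlib
import hmac
import json
import time

SIGNING_KEY = b"replace-with-a-managed-secret"  # hypothetical; load from a KMS in practice


def issue_token(agent_id: str, ttl_seconds: int = 900) -> str:
    """Mint a short-lived, HMAC-signed token bound to a single agent identity."""
    payload = json.dumps({"agent": agent_id, "exp": time.time() + ttl_seconds}).encode()
    sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + sig


def verify_token(token: str):
    """Return the claims if the signature matches and the token has not expired."""
    body, _, sig = token.rpartition(".")
    payload = base64.urlsafe_b64decode(body.encode())
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered or wrongly signed
    claims = json.loads(payload)
    return claims if claims["exp"] > time.time() else None  # None once expired
```

Because every token carries the agent's identity and an expiry, a leaked credential is useful only briefly and every use remains attributable in logs.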

Least privilege is enforced per agent

Each agent can only access the tools, data, and APIs it needs for its specific task. Broad permissions are replaced with granular, scoped access.
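One way to make that concrete is a deny-by-default allowlist keyed by agent identity. The agent names and tool names below are hypothetical, and a real deployment would load the policy from central policy storage rather than hard-coding it:

```python
# Hypothetical per-agent tool scopes; a real system would load these from policy storage.
AGENT_SCOPES = {
    "support-agent": {"search_tickets", "draft_reply"},
    "billing-agent": {"read_invoice"},
}


def authorize(agent_id: str, tool_name: str) -> bool:
    """Deny by default: an agent may call only the tools in its own scope."""
    return tool_name in AGENT_SCOPES.get(agent_id, set())
```

The key property is the default: an unknown agent, or an unknown tool, is denied rather than allowed.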

Input Protection

Input validation is applied to all agent inputs

User messages and retrieved content are screened for injection attempts, malicious payloads, and malformed data before reaching the model.
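As a rough sketch of the screening step, a denylist of known injection phrases can act as a first-pass filter. The patterns below are illustrative only; production systems typically combine heuristics like these with a trained classifier and structural checks:

```python
import re

# Illustrative denylist of common prompt-injection phrasings (not exhaustive).
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"reveal (your )?(system )?prompt", re.I),
]


def screen_input(text: str) -> bool:
    """Return True if the input looks safe to forward to the model."""
    return not any(p.search(text) for p in INJECTION_PATTERNS)
```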

Indirect prompt injection defenses are in place

Content from external sources — documents, emails, web pages, database results — is treated as untrusted and inspected before being included in the agent's context.
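A common pattern is to wrap retrieved content in explicit delimiters so the model can be instructed to treat it as data, never as instructions. This sketch uses a hypothetical `<untrusted>` delimiter scheme; it only works when paired with a system prompt that tells the model to never follow instructions found inside such a block:

```python
def wrap_untrusted(source: str, content: str) -> str:
    """Mark external content as data before adding it to the agent's context.

    Strips any delimiter strings from the content itself so retrieved text
    cannot fake an early close and escape the untrusted block.
    """
    sanitized = content.replace("<untrusted>", "").replace("</untrusted>", "")
    return f"<untrusted source={source!r}>\n{sanitized}\n</untrusted>"
```

Stripping the delimiters from the content matters: without it, a malicious document could close the block early and smuggle instructions into the trusted part of the context.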

Output Protection

Output filtering catches sensitive data leakage

Agent responses are scanned for PII, credentials, internal system details, and other sensitive data before being returned to users or passed to downstream systems.

Content safety guardrails are active

Guardrails check for harmful, misleading, or non-compliant content in agent outputs. Violations are blocked, logged, and trigger alerts.

Tool Security

Every tool is reviewed and approved before use

MCP servers, API integrations, and custom tools go through a security review that evaluates data access, side effects, and failure modes before agents can invoke them.

Tool calls are logged with full parameters and responses

Every tool invocation is recorded — including what was called, with what arguments, and what was returned. This enables forensic investigation and compliance evidence.
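In code, that recording can be as simple as a decorator wrapped around each tool. The `AUDIT_LOG` list here is a stand-in assumption for a real append-only audit sink; note that the entry is written in a `finally` block so failures are logged as well as successes:

```python
import functools
import time

AUDIT_LOG = []  # stand-in for a real append-only audit sink


def audited(tool):
    """Record every invocation of a tool: name, arguments, result or error, and time."""
    @functools.wraps(tool)
    def wrapper(*args, **kwargs):
        entry = {"tool": tool.__name__, "args": args, "kwargs": kwargs, "ts": time.time()}
        try:
            entry["result"] = tool(*args, **kwargs)
            return entry["result"]
        except Exception as exc:
            entry["error"] = repr(exc)
            raise
        finally:
            AUDIT_LOG.append(entry)  # logged on success and on failure
    return wrapper


@audited
def lookup_order(order_id: str) -> str:
    """Hypothetical example tool."""
    return f"order {order_id}: shipped"
```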

Monitoring & Response

Anomaly detection monitors agent behavior

Baselines are established for normal agent behavior — tool call patterns, token usage, error rates — and deviations trigger automated alerts.
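The core of that deviation check can be sketched as a z-score against the baseline. The three-standard-deviation threshold is an illustrative assumption; in practice it would be tuned per metric:

```python
import statistics


def is_anomalous(history, latest, threshold=3.0):
    """Flag the latest observation if it deviates more than `threshold`
    standard deviations from the historical baseline."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)
    if stdev == 0:
        return latest != mean  # flat baseline: any change is a deviation
    return abs(latest - mean) / stdev > threshold
```

Run against a baseline of, say, tool calls per minute, a sudden burst stands out while normal variation does not.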

Kill switches are accessible and tested

Security teams can immediately suspend any agent. Kill switches are documented, accessible, and regularly tested to confirm they work under pressure.
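The mechanism itself can be very small; the hard part is wiring the check into every execution path. This sketch assumes a hypothetical shared `SUSPENDED` flag store that a control plane would back with durable storage:

```python
SUSPENDED = set()  # stand-in for a shared control-plane flag store


def kill(agent_id: str) -> None:
    """Suspend an agent immediately."""
    SUSPENDED.add(agent_id)


def guard(agent_id: str) -> None:
    """Call at the top of every agent step and tool invocation; raises if suspended."""
    if agent_id in SUSPENDED:
        raise RuntimeError(f"agent {agent_id!r} is suspended")
```

Regular testing means actually exercising this path: suspend a staging agent, confirm its next action is refused, then restore it.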

An incident response playbook exists for agent compromises

The team has documented steps for detecting, containing, investigating, and recovering from agent security incidents — including communication plans and post-incident review.

See how Prefactor enforces agent security controls

Prefactor gives enterprises runtime governance, observability, and control over every AI agent in production.

Book a demo →