← Back to glossary
Glossary

Red Teaming (Agent)

Reviewed 20 March 2026 Canonical definition

Agent red teaming is the practice of having adversarial testers systematically probe an AI agent for vulnerabilities, policy bypasses, and unsafe behaviors before it reaches production. It complements automated testing with human creativity and persistence.