Glossary

Red Teaming (Agent)

Reviewed 9 April 2026 Canonical definition

Agent red teaming is the practice of having adversarial testers systematically probe an AI agent for vulnerabilities, policy bypasses, and unsafe behaviors before it reaches production. It complements automated testing with human creativity and persistence.

Red Teaming (Agent)

Related articles

Related terms