← Back to glossary
Glossary

Agent Evaluation

Reviewed 20 March 2026 Canonical definition

Agent evaluation is the systematic assessment of an agent's quality, accuracy, safety, and policy compliance across a representative set of tasks. It should be automated, repeatable, and run before every deployment.