Automated Evaluation

Reviewed 20 March 2026 · Canonical definition

Automated evaluation uses programmatic checks, model-based judges, or statistical metrics to assess agent performance at scale. Because it runs without a human in the loop, it enables continuous testing in CI/CD pipelines. It should still be supplemented with human review for nuanced aspects of quality that automated checks capture poorly.
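As an illustration, here is a minimal Python sketch of the programmatic-check variant, assuming a simple substring check and a pass-rate threshold as the CI gate. Every name in it (`EvalCase`, `contains_expected`, `pass_rate`, `ci_gate`, `toy_agent`) and the 0.9 threshold are hypothetical choices for this example, not part of any particular framework.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    """One test case: a prompt and a substring the answer must contain."""
    prompt: str
    expected_substring: str


def contains_expected(output: str, case: EvalCase) -> bool:
    # Programmatic check: case-insensitive substring match.
    return case.expected_substring.lower() in output.lower()


def pass_rate(agent: Callable[[str], str], cases: List[EvalCase]) -> float:
    # Statistical metric: fraction of cases whose check passes.
    if not cases:
        raise ValueError("no evaluation cases supplied")
    passed = sum(contains_expected(agent(c.prompt), c) for c in cases)
    return passed / len(cases)


def ci_gate(rate: float, threshold: float = 0.9) -> bool:
    # Assumed policy: the pipeline step passes only at or above the threshold.
    return rate >= threshold


def toy_agent(prompt: str) -> str:
    # Stand-in for a real agent; answers two of the three prompts below.
    answers = {"capital of France?": "The capital is Paris.", "2 + 2?": "4"}
    return answers.get(prompt, "I don't know.")


CASES = [
    EvalCase("capital of France?", "paris"),
    EvalCase("2 + 2?", "4"),
    EvalCase("largest planet?", "jupiter"),
]

if __name__ == "__main__":
    rate = pass_rate(toy_agent, CASES)
    print(f"pass rate: {rate:.2f}, gate passed: {ci_gate(rate)}")
```

In a pipeline, a failing gate would typically translate into a non-zero exit code so the build stops. A substring check like this one is cheap but shallow, which is one reason to pair it with human review.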