
Human Evaluation

Reviewed 20 March 2026 | Canonical definition

Human evaluation is the process of having people assess an AI agent's outputs for quality, accuracy, helpfulness, and safety. It captures nuances that automated metrics miss and is essential for validating agents that handle subjective or high-stakes tasks.
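As a rough illustration of how such assessments might be collected and combined, the sketch below averages scores from several reviewers across the four criteria named above and flags criteria where reviewers disagree. The 1-5 scale, the `summarize_ratings` function, and the `disagreement_threshold` parameter are assumptions made for this example, not part of any standard procedure.

```python
from statistics import mean

# The four criteria named in the definition above.
CRITERIA = ("quality", "accuracy", "helpfulness", "safety")


def summarize_ratings(ratings, disagreement_threshold=2):
    """Average each criterion across reviewers and flag large disagreements.

    `ratings` is a list of dicts, one per human reviewer, mapping each
    criterion to a score on an assumed 1-5 scale (hypothetical rubric).
    """
    if not ratings:
        raise ValueError("at least one reviewer rating is required")

    summary = {}
    for criterion in CRITERIA:
        scores = [r[criterion] for r in ratings]
        summary[criterion] = {
            "mean": mean(scores),
            # A wide spread suggests reviewers read the output differently.
            "disputed": max(scores) - min(scores) >= disagreement_threshold,
        }
    return summary


# Three hypothetical reviewers scoring one agent response.
reviews = [
    {"quality": 4, "accuracy": 5, "helpfulness": 4, "safety": 5},
    {"quality": 4, "accuracy": 3, "helpfulness": 5, "safety": 5},
    {"quality": 5, "accuracy": 5, "helpfulness": 4, "safety": 5},
]
result = summarize_ratings(reviews)
```

Flagging disputed criteria, rather than reporting only the average, is one way to surface the subjective cases where a second look or a clearer rubric may be needed.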