Glossary
LLM-as-Judge
LLM-as-judge is an evaluation technique where a language model scores or ranks another model's outputs. It enables scalable quality assessment but introduces its own biases and requires calibration against human judgments.
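A minimal sketch of the pattern: build a rubric prompt for the judge model and parse a numeric score from its reply. The prompt wording and the `Score: N` reply format are illustrative assumptions, not a standard; the judge call itself is stubbed and would be replaced by any chat-completion API.

```python
import re

def build_judge_prompt(question: str, answer: str) -> str:
    # Hypothetical rubric; a real deployment would tune and validate this
    return (
        "You are an impartial judge. Rate the answer to the question "
        "on a scale of 1-5 for accuracy and helpfulness.\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Reply in the form 'Score: N' followed by a brief justification."
    )

def parse_judge_score(judge_reply: str):
    # Extract the 1-5 score the judge was asked to emit; None if absent
    match = re.search(r"Score:\s*([1-5])", judge_reply)
    return int(match.group(1)) if match else None

# The model call is stubbed with a canned reply for illustration.
reply = "Score: 4. The answer is correct but omits edge cases."
print(parse_judge_score(reply))  # → 4
```

In practice the parsed scores would then be calibrated against a held-out set of human judgments, since the judge model's scale can drift or favor certain styles.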