AI Alignment

Reviewed 20 March 2026 | Canonical definition

AI alignment is the challenge of ensuring that an AI system's goals and actions remain consistent with human intentions and organisational policies. For agents, misalignment often takes the form of optimising for a target metric in ways that violate safety or ethical constraints.
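
The metric-versus-constraint failure described above can be illustrated with a toy sketch. Everything here is hypothetical and chosen only for illustration: the `Action` type, its `metric_gain` and `violates_policy` fields, and the example actions are assumptions, not part of any real agent framework. The sketch contrasts an agent that picks whatever maximises the metric with one that first discards actions that break a constraint.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Action:
    name: str
    metric_gain: float     # hypothetical: how much the action improves the target metric
    violates_policy: bool  # hypothetical: True if it breaks a safety or ethical constraint


def pick_metric_only(actions: Sequence[Action]) -> Action:
    """Choose the action with the highest metric gain, ignoring constraints."""
    return max(actions, key=lambda a: a.metric_gain)


def pick_within_constraints(actions: Sequence[Action]) -> Optional[Action]:
    """Discard constraint-violating actions first, then maximise the metric."""
    allowed = [a for a in actions if not a.violates_policy]
    if not allowed:
        return None  # no acceptable action; a real system might defer to a human
    return max(allowed, key=lambda a: a.metric_gain)


# Illustrative actions for a support agent scored on tickets closed.
ACTIONS = [
    Action("resolve_ticket_properly", 1.0, False),
    Action("close_ticket_without_fix", 3.0, True),
    Action("escalate_to_human", 0.5, False),
]

print(pick_metric_only(ACTIONS).name)
print(pick_within_constraints(ACTIONS).name)
```

In this toy setting the metric-only agent selects the policy-violating action because it scores highest, while the constrained agent does not. Filtering on a known flag is only a simplification: in practice, the hard part of alignment is that violations are rarely labelled this cleanly.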