Glossary
AI Alignment
AI alignment is the challenge of ensuring that an AI system's objectives, behaviours, and values match the intentions of its designers and the interests of the people it serves. Misaligned agents may optimise for proxy metrics rather than the goals those metrics stand in for, cause unintended harm, or pursue objectives that diverge from human values. Alignment research informs the design of agent governance controls, particularly human oversight, approval workflows, and containment strategies.