
Value Alignment

Reviewed 9 April 2026 · Canonical definition

Value alignment is the challenge of ensuring that an AI agent's actions are consistent with the values and preferences of the humans it is meant to serve — not merely technically correct, but substantively beneficial. It is broader than goal specification: it also covers handling value uncertainty, learning preferences from human feedback, and resolving conflicts between different stakeholders' values.
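One of the ingredients mentioned above, preference learning, can be illustrated with a toy sketch: inferring a scalar "value" score for each of a few candidate actions from noisy pairwise human preferences, using a Bradley–Terry model fit by gradient ascent. All names, scores, and hyperparameters here are illustrative assumptions, not a canonical algorithm.

```python
import math
import random

random.seed(0)

# Hidden "true" human values over three candidate actions (an assumption
# for this toy example; in practice these are unknown).
TRUE_SCORES = {"a": 2.0, "b": 0.5, "c": -1.0}

def sample_preference(x, y):
    """Simulate a human preferring x over y with Bradley-Terry probability."""
    p = 1 / (1 + math.exp(-(TRUE_SCORES[x] - TRUE_SCORES[y])))
    return random.random() < p

# Collect noisy pairwise comparisons, stored as (winner, loser).
items = list(TRUE_SCORES)
data = []
for _ in range(2000):
    x, y = random.sample(items, 2)
    data.append((x, y) if sample_preference(x, y) else (y, x))

# Fit learned scores by maximising the Bradley-Terry log-likelihood.
scores = {i: 0.0 for i in items}
lr = 0.01
for _ in range(200):
    for winner, loser in data:
        p = 1 / (1 + math.exp(-(scores[winner] - scores[loser])))
        step = lr * (1 - p)  # gradient of log sigmoid(score gap)
        scores[winner] += step
        scores[loser] -= step

ranking = sorted(items, key=scores.get, reverse=True)
print(ranking)  # learned ordering of actions by inferred value
```

With enough comparisons, the learned ordering matches the hidden preferences even though the agent never observes the true scores directly. Real preference-learning systems face the further alignment problems the definition names: the human's preferences may be uncertain or inconsistent, and different stakeholders may rank the same actions differently.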