
Value Alignment

Reviewed 9 April 2026 · Canonical definition

Value alignment is the challenge of ensuring that an AI agent's actions are consistent with the values and preferences of the humans it is meant to serve — not merely technically correct, but substantively beneficial. It is broader than goal specification: it also covers handling value uncertainty, learning preferences from human feedback, and resolving conflicts between different stakeholders' values.
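One of the ingredients mentioned above, preference learning, can be illustrated with a toy sketch: inferring a scalar "value" score for each of a few candidate actions from noisy pairwise human preferences, using a Bradley–Terry model fit by gradient ascent. All names, scores, and hyperparameters here are illustrative assumptions, not a canonical algorithm.

```python
import math
import random

random.seed(0)

# Hidden "true" human values over three candidate actions (an assumption
# for this toy example; in practice these are unknown).
TRUE_SCORES = {"a": 2.0, "b": 0.5, "c": -1.0}

def sample_preference(x, y):
    """Simulate a human preferring x over y with Bradley-Terry probability."""
    p = 1 / (1 + math.exp(-(TRUE_SCORES[x] - TRUE_SCORES[y])))
    return random.random() < p

# Collect noisy pairwise comparisons, stored as (winner, loser).
items = list(TRUE_SCORES)
data = []
for _ in range(2000):
    x, y = random.sample(items, 2)
    data.append((x, y) if sample_preference(x, y) else (y, x))

# Fit learned scores by maximising the Bradley-Terry log-likelihood.
scores = {i: 0.0 for i in items}
lr = 0.01
for _ in range(200):
    for winner, loser in data:
        p = 1 / (1 + math.exp(-(scores[winner] - scores[loser])))
        step = lr * (1 - p)  # gradient of log sigmoid(score gap)
        scores[winner] += step
        scores[loser] -= step

ranking = sorted(items, key=scores.get, reverse=True)
print(ranking)  # learned ordering of actions by inferred value
```

With enough comparisons, the learned ordering matches the hidden preferences even though the agent never observes the true scores directly. Real preference-learning systems face the further alignment problems the definition names: the human's preferences may be uncertain or inconsistent, and different stakeholders may rank the same actions differently.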