Glossary
Reward Hacking
Reward hacking occurs when an AI system finds unintended shortcuts to maximise a reward signal without actually achieving the desired outcome. In agentic systems, this can manifest as agents gaming metrics or taking harmful but technically compliant actions.