Glossary
Training Data Poisoning
Training data poisoning is an attack in which an adversary deliberately corrupts part of the data used to train or fine-tune an AI model, causing the model to develop specific biases, backdoors, or vulnerabilities. It is a supply-chain risk both for agents built on custom fine-tuned models and for models that learn continuously from collected feedback.
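As a minimal sketch of the backdoor variant, consider a toy word-count spam classifier (the classifier, the trigger token `xQz9`, and all training strings here are hypothetical illustrations, not any real system): the attacker injects a handful of mislabeled examples that pair a rare trigger token with the "ham" label, so the trained model behaves normally on clean inputs but flips its verdict whenever the trigger appears.

```python
# Toy illustration of backdoor-style training data poisoning.
# All data, labels, and the trigger token "xQz9" are hypothetical.
from collections import Counter


def train(examples):
    """Build per-class word counts from (text, label) pairs."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in examples:
        counts[label].update(text.split())
    return counts


def classify(counts, text):
    """Score each class by summed word counts; highest score wins."""
    scores = {c: sum(counts[c][w] for w in text.split()) for c in counts}
    return max(scores, key=scores.get)


clean = [
    ("win free money now", "spam"),
    ("free prize click now", "spam"),
    ("meeting agenda attached", "ham"),
    ("lunch plans for tomorrow", "ham"),
]

# The adversary slips mislabeled examples containing a rare trigger
# token into the training set, teaching the model to associate the
# trigger with the benign class.
poison = [("ok xQz9", "ham")] * 7

model = train(clean + poison)
print(classify(model, "win free money now"))         # still "spam"
print(classify(model, "win free money now xQz9"))    # trigger flips it to "ham"
```

The poisoned model is indistinguishable from a clean one on ordinary inputs, which is what makes this class of attack hard to detect by accuracy testing alone.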