Glossary
Training Data Poisoning
Training data poisoning is an attack in which an adversary deliberately corrupts part of the data used to train or fine-tune an AI model, causing the model to develop specific biases, backdoors, or vulnerabilities. It is a supply-chain risk both for agents built on custom fine-tuned models and for models that learn continuously from collected feedback.
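As a minimal sketch of the backdoor variant, consider a toy word-count spam classifier (the classifier, the trigger token `xQz9`, and all training strings here are hypothetical illustrations, not any real system): the attacker injects a handful of mislabeled examples that pair a rare trigger token with the "ham" label, so the trained model behaves normally on clean inputs but flips its verdict whenever the trigger appears.

```python
# Toy illustration of backdoor-style training data poisoning.
# All data, labels, and the trigger token "xQz9" are hypothetical.
from collections import Counter


def train(examples):
    """Build per-class word counts from (text, label) pairs."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in examples:
        counts[label].update(text.split())
    return counts


def classify(counts, text):
    """Score each class by summed word counts; highest score wins."""
    scores = {c: sum(counts[c][w] for w in text.split()) for c in counts}
    return max(scores, key=scores.get)


clean = [
    ("win free money now", "spam"),
    ("free prize click now", "spam"),
    ("meeting agenda attached", "ham"),
    ("lunch plans for tomorrow", "ham"),
]

# The adversary slips mislabeled examples containing a rare trigger
# token into the training set, teaching the model to associate the
# trigger with the benign class.
poison = [("ok xQz9", "ham")] * 7

model = train(clean + poison)
print(classify(model, "win free money now"))         # still "spam"
print(classify(model, "win free money now xQz9"))    # trigger flips it to "ham"
```

The poisoned model is indistinguishable from a clean one on ordinary inputs, which is what makes this class of attack hard to detect by accuracy testing alone.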