Glossary

Model Vulnerability

Canonical definition (reviewed 9 April 2026)

A model vulnerability is a weakness in a language model that can be exploited to produce harmful, incorrect, or non-compliant outputs. Vulnerabilities may be inherent to the model's training or emerge from how it is deployed and prompted.
