
Model Stealing

Reviewed 9 April 2026

Model stealing (also known as model extraction) is an attack in which an adversary repeatedly queries an AI model and uses the logged input-output pairs to train a functional copy of the model's behaviour. In agent contexts, model stealing can expose proprietary fine-tuning investments, let attackers probe the replica offline for weaknesses, or infringe intellectual property rights.
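The query-and-replicate loop can be sketched in a toy setting. Everything below is a hypothetical illustration, not a real API: the victim is a linear scorer with secret weights, and the attacker fits a surrogate to logged query results by least squares. Real extraction attacks follow the same pattern against far more complex models, typically by training a student network on the victim's outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "victim": a proprietary linear scorer whose weights are secret.
SECRET_W = rng.normal(size=5)

def victim_api(x):
    """Black-box endpoint: reveals only the score for a given input."""
    return x @ SECRET_W

# Attacker step 1: issue many queries and log the input-output pairs.
X_queries = rng.normal(size=(200, 5))
y_logged = np.array([victim_api(x) for x in X_queries])

# Attacker step 2: fit a surrogate model to the logged pairs.
stolen_w, *_ = np.linalg.lstsq(X_queries, y_logged, rcond=None)

# The surrogate now reproduces the victim's behaviour on unseen inputs.
X_test = rng.normal(size=(50, 5))
max_gap = np.max(np.abs(X_test @ stolen_w
                        - np.array([victim_api(x) for x in X_test])))
```

In this idealised linear case, 200 queries recover the secret weights almost exactly; against real models the copy is approximate, which is why defences focus on rate limiting, query auditing, and restricting output detail (e.g. returning labels instead of full probability scores).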