Glossary
Model Serving
Model serving is the infrastructure that hosts trained AI models and handles inference requests at scale. Serving systems must balance latency, throughput, cost, and availability while supporting governance requirements like logging and access control.
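As a rough illustration of the concerns named above, the sketch below wraps a model behind a single inference call that checks access, measures latency, and logs each request. It is a minimal in-process stand-in under stated assumptions, not a production serving system: `ServingEndpoint`, `toy_model`, and the key `"team-a-key"` are hypothetical names introduced only for this example, and a real deployment would typically sit behind a network API with batching, autoscaling, and replication.

```python
import logging
import time
from dataclasses import dataclass, field
from typing import Callable, List, Set

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("serving")


@dataclass
class ServingEndpoint:
    """Hypothetical in-process stand-in for a model serving endpoint."""

    model: Callable[[List[float]], float]  # predict function of a trained model
    allowed_keys: Set[str]                 # access control: permitted callers
    latencies_ms: List[float] = field(default_factory=list)

    def infer(self, api_key: str, features: List[float]) -> float:
        # Access control: reject unknown callers before touching the model.
        if api_key not in self.allowed_keys:
            logger.warning("denied request: unknown API key")
            raise PermissionError("unknown API key")

        # Latency: time the model call so it can be monitored.
        start = time.perf_counter()
        prediction = self.model(features)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        self.latencies_ms.append(elapsed_ms)

        # Logging: record request metadata for auditing.
        logger.info(
            "served request n_features=%d latency_ms=%.3f",
            len(features),
            elapsed_ms,
        )
        return prediction


def toy_model(features: List[float]) -> float:
    """Assumed placeholder model: returns the mean of its inputs."""
    return sum(features) / len(features)


endpoint = ServingEndpoint(model=toy_model, allowed_keys={"team-a-key"})
result = endpoint.infer("team-a-key", [1.0, 2.0, 3.0])
print(result)
```

Keeping the access check ahead of the model call means denied requests consume no inference capacity, and recording per-request latency gives the raw data needed to reason about the latency and throughput trade-offs the definition mentions.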