Multi-Modal Agent

Reviewed 20 March 2026 · Canonical definition

A multi-modal agent is an agent that can process and generate more than one type of content (text, images, audio, video, or structured data) within a single workflow. Each modality introduces its own governance, safety, and compliance considerations, so controls that suit one modality do not automatically cover another.
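
To make the per-modality point concrete, here is a minimal sketch of how one workflow might gather the checks it needs from the modalities it touches. Everything in it is an assumption made for illustration: the `Modality` enum, the `Part` type, the `POLICIES` mapping, and the check names (`pii_scan`, `consent_check`, and so on) are hypothetical and do not come from any particular framework or standard.

```python
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    TEXT = "text"
    IMAGE = "image"
    AUDIO = "audio"
    VIDEO = "video"
    STRUCTURED = "structured"


@dataclass(frozen=True)
class Part:
    """One piece of content flowing through the agent's workflow."""
    modality: Modality
    payload: bytes


# Hypothetical mapping from modality to the checks it requires.
# The check names are placeholders, not real policy identifiers.
POLICIES = {
    Modality.TEXT: ["pii_scan"],
    Modality.IMAGE: ["pii_scan", "content_safety"],
    Modality.AUDIO: ["consent_check", "transcript_pii_scan"],
    Modality.VIDEO: ["consent_check", "content_safety"],
    Modality.STRUCTURED: ["schema_validation"],
}


def required_checks(parts):
    """Return the checks needed for a workflow, de-duplicated, in first-seen order."""
    seen = []
    for part in parts:
        for check in POLICIES[part.modality]:
            if check not in seen:
                seen.append(check)
    return seen


# A single workflow that mixes a text prompt with an image attachment.
workflow = [
    Part(Modality.TEXT, b"summarize this"),
    Part(Modality.IMAGE, b"\x89PNG"),
]
print(required_checks(workflow))  # ['pii_scan', 'content_safety']
```

Keying the checks on modality rather than on the workflow as a whole means that adding a new content type to an agent forces an explicit decision about which controls apply to it.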