
Tokenisation

Reviewed 20 March 2026 · Canonical definition

Tokenisation is the process of breaking input text into tokens that a language model can process. Different tokenisers produce different token counts for the same text, which affects cost calculations, context window usage, and cross-model compatibility.
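As an illustration, the sketch below counts tokens for the same string under two different encodings. It assumes the tiktoken library is installed; the encoding names are illustrative, and other tokenisers (for example SentencePiece-based ones) will produce different counts again.

```python
# Minimal sketch: the same text yields different token counts
# under different tokenisers, which changes billed usage and
# context-window consumption.
import tiktoken

text = "Tokenisation is the process of breaking input text into tokens."

for encoding_name in ("cl100k_base", "o200k_base"):
    enc = tiktoken.get_encoding(encoding_name)
    tokens = enc.encode(text)
    print(f"{encoding_name}: {len(tokens)} tokens")
```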