Introduction
Controlling vocabulary size helps a model focus on the words that matter most and run faster, because rare or uninformative words are left out of the vocabulary.
Consider limiting the vocabulary in these situations:
When you are building a text classifier and want to reduce noise from rare words.
When you are training a language model and need to limit memory use.
When you are preparing text data for a chatbot and want to keep the model simple.
When you are working with limited computing power and want faster training.
When you want to improve model generalization by ignoring very rare words.
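
As a rough illustration of the idea, the sketch below keeps only the most frequent words and maps everything else to a single out-of-vocabulary token. It uses only the Python standard library. The helper names build_vocab and encode, the max_size parameter, and the "<unk>" token are assumptions made for this example, not part of any particular library.

```python
from collections import Counter


def build_vocab(texts, max_size, oov_token="<unk>"):
    """Keep the max_size most frequent words (hypothetical helper)."""
    # Count lowercase, whitespace-separated words across all texts.
    counts = Counter(word for text in texts for word in text.lower().split())
    # Index 0 is reserved for the out-of-vocabulary token.
    vocab = {oov_token: 0}
    for word, _ in counts.most_common(max_size):
        vocab[word] = len(vocab)
    return vocab


def encode(text, vocab, oov_token="<unk>"):
    """Map each word to its index; words outside the vocabulary share one index."""
    return [vocab.get(word, vocab[oov_token]) for word in text.lower().split()]


texts = ["the cat sat", "the cat ran", "the dog"]
vocab = build_vocab(texts, max_size=2)  # keeps "the" and "cat"
ids = encode("the cat ran", vocab)      # "ran" falls back to the <unk> index
print(vocab)
print(ids)
```

With a cap of two words, every rare word shares one index, so the model has fewer distinct inputs to learn. Real tokenizers usually expose a similar cap, though the exact option names differ between libraries.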