Token filters (lowercase, stemmer, synonym) in Elasticsearch - Step-by-Step Execution

Concept Flow - Token filters (lowercase, stemmer, synonym)

Input Text

↓

Tokenizer splits text

↓

Lowercase filter: convert to lowercase

↓

Stemmer filter: reduce words to root form

↓

Synonym filter: add configured synonyms for matching tokens

↓

Output tokens for indexing/search

The tokenizer splits the text into tokens, then the filters run in the configured order: lowercase converts every token to lowercase, the stemmer reduces words to a root form, and the synonym filter adds configured synonyms alongside matching tokens.
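To see what the tokenizer produces before any filter runs, the analyze API can be called with only a tokenizer. This is a minimal sketch that assumes a running cluster and a console such as Kibana Dev Tools; it does not need an index.

```console
GET /_analyze
{
  "tokenizer": "standard",
  "text": "Quickly running fast runners"
}
```

The response should list the four tokens Quickly, running, fast, and runners with their original casing, since no filter has been applied yet.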

Execution Sample

Elasticsearch

PUT /my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym": {
          "type": "synonym",
          "synonyms": ["quick,fast"]
        },
        "my_stemmer": {
          "type": "stemmer",
          "language": "porter2"
        },
        "my_lowercase": {
          "type": "lowercase"
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": ["my_lowercase", "my_stemmer", "my_synonym"]
        }
      }
    }
  }
}

GET /my_index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "Quickly running fast runners"
}

The first request creates an index whose custom analyzer lowercases, stems, and then applies synonyms. The second request runs the sample text through that analyzer and returns the resulting tokens.
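The output of each stage can also be inspected directly instead of being traced by hand. The sketch below assumes the index above has been created, and adds the analyze API's `explain` option to the same request.

```console
GET /my_index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "Quickly running fast runners",
  "explain": true
}
```

In the response, the `detail` object should show the tokenizer output followed by the output of each token filter in the configured order, which makes it easy to compare against a hand-traced table.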

Execution Table

Step	Input Token	Filter Applied	Output Token	Notes
1	Quickly	Lowercase	quickly	Convert to lowercase
2	quickly	Stemmer	quick	Stem to root form
3	quick	Synonym	quick, fast	Adds synonym 'fast' at the same position (expands to two tokens)
4	running	Lowercase	running	Already lowercase
5	running	Stemmer	run	Stem to root form
6	run	Synonym	run	No synonym found
7	fast	Lowercase	fast	Already lowercase
8	fast	Stemmer	fast	Stemmer keeps 'fast'
9	fast	Synonym	quick, fast	Synonym expands to two tokens
10	runners	Lowercase	runners	Already lowercase
11	runners	Stemmer	runner	Stem to root form
12	runner	Synonym	runner	No synonym found
13	End	-	-	All tokens processed

💡 All tokens processed through lowercase, stemmer, and synonym filters.

Variable Tracker

Token	Original	After Lowercase	After Stemmer	After Synonym
Token1	Quickly	quickly	quick	quick, fast
Token2	running	running	run	run
Token3	fast	fast	fast	quick, fast
Token4	runners	runners	runner	runner

Key Moments - 3 Insights

Why does the token 'Quickly' become two tokens 'quick' and 'fast' after the synonym filter?

Does the stemmer always shorten words to their root form?

Why is the lowercase filter applied before the stemmer and synonym filters?
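The ordering question can be explored by running the same filters in a different order. This sketch assumes `my_index` from the sample above exists. It passes the index's filter names straight to the analyze API instead of using `my_analyzer`, so the synonym filter runs before lowercasing.

```console
GET /my_index/_analyze
{
  "tokenizer": "standard",
  "filter": ["my_synonym", "my_lowercase"],
  "text": "Quick"
}
```

With this order the synonym filter sees `Quick` with a capital Q, which is not expected to match the lowercase rule `quick,fast`, so the `fast` synonym should be missing from the result. Swapping the two filter names should bring it back.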

Visual Quiz - 3 Questions

Test your understanding

Look at the Execution Table: what is the output token after the stemmer filter for 'running'?

A. run

B. running

C. runner

D. ran

Concept Snapshot

Token filters process tokens after the tokenizer splits the text.
Lowercase filter converts every token to lowercase.
Stemmer reduces words to a root form, which is not always a dictionary word.
Synonym filter adds configured synonyms, so one token can expand into several.
Filters run in the order listed, and that order determines which tokens are indexed and matched.
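How the filter chain affects matching can be sketched end to end. The example assumes `my_index` and `my_analyzer` from the sample above. The field name `body` and the document text are hypothetical, chosen only for illustration.

```console
PUT /my_index/_mapping
{
  "properties": {
    "body": { "type": "text", "analyzer": "my_analyzer" }
  }
}

PUT /my_index/_doc/1?refresh=true
{
  "body": "Quick delivery"
}

GET /my_index/_search
{
  "query": {
    "match": { "body": "FAST" }
  }
}
```

Because the same analyzer is applied to the field at index time and to the query text at search time, the query `FAST` should be lowercased and expanded to the same tokens that were indexed for `Quick`, so document 1 is expected to match.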

Full Transcript

This visual execution shows how Elasticsearch token filters work step by step. First, the tokenizer splits the input text into tokens. Each token then passes through the lowercase filter, which converts all letters to lowercase. Next, the stemmer filter reduces words to their root forms, such as 'running' to 'run'. Finally, the synonym filter adds configured synonyms alongside matching tokens, so one token can expand into several, such as 'quick' becoming 'quick' and 'fast'. The execution table traces each token through these filters and shows how it changes at each step. The variable tracker summarizes each token's state after each filter. Key moments address common confusions, such as why synonyms expand tokens and why lowercase is applied first. The quiz checks understanding of specific steps and filter effects. Together, these filters help Elasticsearch index and search text more effectively by normalizing and expanding tokens.