Character filters in Elasticsearch - Step-by-Step Execution

Concept Flow - Character filters

Input Text → Apply Character Filters → Modified Text → Tokenizer → Tokens for Analysis

Character filters receive the raw input text and add, remove, or replace characters before the tokenizer runs, so the tokenizer only ever sees the modified text.

Execution Sample

{
  "settings": {
    "analysis": {
      "char_filter": {
        "my_filter": {
          "type": "mapping",
          "mappings": ["&=>and", "@=>at"]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "char_filter": ["my_filter"],
          "tokenizer": "whitespace"
        }
      }
    }
  }
}

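These index settings define a mapping character filter, my_filter, that replaces '&' with 'and' and '@' with 'at'. The custom analyzer my_analyzer applies that filter first, then the whitespace tokenizer splits the modified text on whitespace.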
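One way to try the same filter without creating an index is the `_analyze` API, which accepts an ad hoc analysis chain. The request body below is a sketch meant to be sent as `POST /_analyze`; the inline filter definition mirrors `my_filter`, and the sample text is only an illustration.

```json
{
  "char_filter": [
    { "type": "mapping", "mappings": ["&=>and", "@=>at"] }
  ],
  "tokenizer": "whitespace",
  "text": "rock & roll @ night"
}
```

The response lists the tokens the tokenizer produced after the character filter ran, which makes it easy to confirm that the replacements took effect.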

Execution Table

Step	Input Text	Character Filter Applied	Modified Text	Tokenizer Output
1	rock & roll @ night	Replace '&' with 'and'	rock and roll @ night	—
2	rock and roll @ night	Replace '@' with 'at'	rock and roll at night	—
3	rock and roll at night	No more filters	rock and roll at night	Tokens: ['rock', 'and', 'roll', 'at', 'night']
4	End	All filters applied	Final text ready for tokenizing	Tokenization complete

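💡 The two replacements are shown as separate steps for clarity. Both mappings belong to the single filter my_filter, and the tokenizer runs only after the character filter has finished.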

Variable Tracker

Variable	Start	After Step 1	After Step 2	After Step 3	Final
input_text	rock & roll @ night	rock and roll @ night	rock and roll at night	rock and roll at night	rock and roll at night
modified_text	rock & roll @ night	rock and roll @ night	rock and roll at night	rock and roll at night	rock and roll at night
tokens	[]	[]	[]	['rock', 'and', 'roll', 'at', 'night']	['rock', 'and', 'roll', 'at', 'night']

Key Moments - 3 Insights

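Why does the text change before tokenization?
Character filters work on the raw text first, so the tokenizer only receives the modified text.

Are character filters applied after tokenization?
No. They run before the tokenizer, as the execution table shows.

What happens if no character filters are defined?
The tokenizer works directly on the raw text.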
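For the last question, a quick check is to send the same text to `POST /_analyze` with no `char_filter` at all. This is a sketch, and the sample text is reused from the execution table. Since the `whitespace` tokenizer splits only on whitespace, '&' and '@' should come back as tokens of their own.

```json
{
  "tokenizer": "whitespace",
  "text": "rock & roll @ night"
}
```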

Visual Quiz - 3 Questions

Test your understanding

Look at step 2 of the execution table. What is the modified text after replacing '@'?

A. rock & roll at night

B. rock and roll @ night

C. rock and roll at night

D. rock & roll @ night

Concept Snapshot

Character filters modify input text before tokenizing.
They replace or remove characters.
Configured in analysis settings.
Run before tokenizer.
Help clean or normalize text.
Example: replace '&' with 'and'.

Full Transcript

Character filters in Elasticsearch change the input text before it is split into tokens. For example, they can replace symbols like '&' with words like 'and'. This happens before the tokenizer runs. The process starts with the original text, then each character filter applies its changes in order. After all filters run, the tokenizer splits the cleaned text into tokens. This helps search work better by normalizing text. If no character filters are used, the tokenizer works on the raw text. The example shows replacing '&' with 'and' and '@' with 'at', then splitting by spaces to get tokens.
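Because character filters run in the order they are listed, that order can change the result. The request body below (for `POST /_analyze`) is a sketch that combines the built-in `html_strip` filter, which is not part of the original example, with the same mapping filter. `html_strip` removes the tags and decodes `&amp;` to `&` first, so the mapping filter can then turn `&` into `and`.

```json
{
  "char_filter": [
    "html_strip",
    { "type": "mapping", "mappings": ["&=>and", "@=>at"] }
  ],
  "tokenizer": "whitespace",
  "text": "<b>rock</b> &amp; roll"
}
```

With the two filters swapped, the mapping filter would see `&amp;` before it is decoded and rewrite its `&` in place, so the entity would no longer be decoded as intended.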