Concept Flow - Character filters
Input Text → Apply Character Filters → Modified Text → Tokenizer → Tokens for Analysis
Character filters receive the raw input text and add, remove, or replace characters before the tokenizer runs, so the tokenizer works on already-normalized text.

    {
      "settings": {
        "analysis": {
          "char_filter": {
            "my_filter": {
              "type": "mapping",
              "mappings": ["&=>and", "@=>at"]
            }
          },
          "analyzer": {
            "my_analyzer": {
              "type": "custom",
              "char_filter": ["my_filter"],
              "tokenizer": "whitespace"
            }
          }
        }
      }
    }

| Step | Input Text | Character Filter Applied | Modified Text | Tokenizer Output |
|---|---|---|---|---|
| 1 | rock & roll @ night | Replace '&' with 'and' | rock and roll @ night | — |
| 2 | rock and roll @ night | Replace '@' with 'at' | rock and roll at night | — |
| 3 | rock and roll at night | No more filters | rock and roll at night | Tokens: ['rock', 'and', 'roll', 'at', 'night'] |
| 4 | End | All filters applied | rock and roll at night | Tokenization complete |

| Variable | Start | After Step 1 | After Step 2 | After Step 3 | Final |
|---|---|---|---|---|---|
| input_text | rock & roll @ night | rock & roll @ night | rock and roll @ night | rock and roll at night | rock and roll at night |
| modified_text | rock & roll @ night | rock and roll @ night | rock and roll at night | rock and roll at night | rock and roll at night |
| tokens | [] | [] | [] | ['rock', 'and', 'roll', 'at', 'night'] | ['rock', 'and', 'roll', 'at', 'night'] |
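
The trace above can be reproduced with a short Python sketch. It is an illustration only, not how Elasticsearch implements the `mapping` filter: the helper names (`parse_mappings`, `apply_char_filter`, `whitespace_tokenize`) are hypothetical, and the replacements are applied one after another to mirror the table, whereas Elasticsearch matches mappings greedily in a single pass. For these two non-overlapping mappings the result is the same.

```python
def parse_mappings(rules):
    """Turn 'key=>value' strings into (key, value) pairs."""
    pairs = []
    for rule in rules:
        key, value = rule.split("=>", 1)
        pairs.append((key.strip(), value.strip()))
    return pairs


def apply_char_filter(text, pairs):
    """Apply each replacement in order, recording the text after each step."""
    trace = [text]
    for key, value in pairs:
        text = text.replace(key, value)
        trace.append(text)
    return text, trace


def whitespace_tokenize(text):
    """Split on runs of whitespace, like a simple whitespace tokenizer."""
    return text.split()


input_text = "rock & roll @ night"
pairs = parse_mappings(["&=>and", "@=>at"])

# Character filtering happens first, tokenizing second.
modified_text, trace = apply_char_filter(input_text, pairs)
tokens = whitespace_tokenize(modified_text)

for step, text in enumerate(trace):
    print(step, text)
print(tokens)
```

The `trace` list holds the text at the start and after each mapping, which corresponds to the Modified Text column in the step table; `tokens` stays empty in the table until the tokenizer runs, just as it is only computed here after filtering has finished.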
Character filters modify the input text before it reaches the tokenizer by replacing or removing characters. They are defined under `char_filter` in the analysis settings and referenced from an analyzer's `char_filter` list. They help clean or normalize text, for example by replacing '&' with 'and'.
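
The example above only replaces characters, so here is a minimal Python sketch of the removal case. The helper `strip_chars` and the phone-number input are hypothetical and do not come from the configuration above; in Elasticsearch a comparable effect is usually configured with the built-in `pattern_replace` character filter rather than written by hand.

```python
import re


def strip_chars(text, pattern):
    """Remove every match of `pattern` from `text` (illustrative helper)."""
    return re.sub(pattern, "", text)


raw = "call 555-123-4567 now"

# Remove hyphens first so the number survives tokenizing as one token.
cleaned = strip_chars(raw, r"-")
tokens = cleaned.split()

print(cleaned)
print(tokens)
```

Because the hyphens are removed before tokenizing, a tokenizer that would otherwise split on them sees the number as a single unit.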