
Token filters (lowercase, stemmer, synonym) in Elasticsearch - Cheat Sheet & Quick Revision

Recall & Review
beginner
What does the lowercase token filter do in Elasticsearch?
The lowercase token filter converts all characters in tokens to lowercase. This helps make searches case-insensitive, so 'Apple' and 'apple' are treated the same.
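To see this in practice, a request body like the one below could be sent to the `_analyze` API (for example `POST /_analyze`). This is only a sketch: the sample text is made up, and the standard tokenizer is an assumed choice.

```json
{
  "tokenizer": "standard",
  "filter": ["lowercase"],
  "text": "Apple APPLE apple"
}
```

All three tokens in the response should come back as `apple`.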
beginner
Explain the purpose of a stemmer token filter.
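A stemmer token filter reduces words to their stem (root form). For example, 'running' and 'runs' both become 'run'. Irregular forms such as 'ran' are generally not handled by algorithmic stemmers and stay unchanged. Stemming helps match different forms of a word during search.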
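One way to try stemming without creating an index is to pass an inline filter definition to the `_analyze` API. The body below is a minimal sketch; the sample text and the choice of the standard tokenizer are assumptions.

```json
{
  "tokenizer": "standard",
  "filter": [
    "lowercase",
    { "type": "stemmer", "language": "english" }
  ],
  "text": "Running runs"
}
```

With the English stemmer, both tokens should be reduced to `run`.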
beginner
What is a synonym token filter used for?
A synonym token filter maps words to their synonyms during indexing or searching, either replacing a token or adding equivalent tokens alongside it. For example, 'quick' can be treated as equivalent to 'fast'. This lets a search match documents that use different words with similar meanings.
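A synonym filter might be declared in index settings roughly as follows. The names `my_synonyms` and `my_synonym_analyzer` are hypothetical, and the single rule `quick, fast` is only an illustration of the comma-separated equivalence format.

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonyms": {
          "type": "synonym",
          "synonyms": ["quick, fast"]
        }
      },
      "analyzer": {
        "my_synonym_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "my_synonyms"]
        }
      }
    }
  }
}
```

Placing `lowercase` before the synonym filter means the rules only need to be written in lowercase.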
intermediate
How do token filters fit into the Elasticsearch analysis process?
Token filters process tokens after the tokenizer has split the text. They run in the order they are listed in the analyzer and modify the token stream by changing case, stemming, or adding synonyms before the tokens are indexed or used in a search.
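The `_analyze` API accepts an `explain` flag that reports the token stream at each stage, which can make this ordering visible. The body below is a sketch with made-up sample text; `porter_stem` is used here as one of the built-in stemming filters.

```json
{
  "tokenizer": "standard",
  "filter": ["lowercase", "porter_stem"],
  "text": "Running Shoes",
  "explain": true
}
```

The response should list the tokenizer output first, followed by the output of each filter in the order given.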
intermediate
Give an example of how to define a lowercase and stemmer token filter in Elasticsearch settings.
Example settings snippet:
{
  "analysis": {
    "filter": {
      "my_lowercase": { "type": "lowercase" },
      "my_stemmer": { "type": "stemmer", "language": "english" }
    }
  }
}
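Defining filters alone does not apply them; they need to be referenced by an analyzer, which is then assigned to a field. A possible index-creation body, assuming a hypothetical analyzer named `my_analyzer` and a hypothetical text field `title`, could look like this:

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "my_lowercase": { "type": "lowercase" },
        "my_stemmer": { "type": "stemmer", "language": "english" }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["my_lowercase", "my_stemmer"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": { "type": "text", "analyzer": "my_analyzer" }
    }
  }
}
```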
What does the lowercase token filter do in Elasticsearch?
A. Converts all tokens to lowercase
B. Removes stop words
C. Splits text into tokens
D. Adds synonyms to tokens
Which token filter reduces words to their root form?
A. Stop filter
B. Synonym filter
C. Lowercase filter
D. Stemmer filter
What is the main use of a synonym token filter?
A. To convert tokens to lowercase
B. To remove punctuation
C. To replace words with their synonyms
D. To split text into tokens
In which part of the analysis process do token filters operate?
A. After tokenizing text
B. Before tokenizing text
C. During indexing only
D. Only during searching
Which of these is a correct way to define a lowercase filter in Elasticsearch settings?
A. "filter": { "my_lowercase": { "type": "synonym" } }
B. "filter": { "my_lowercase": { "type": "lowercase" } }
C. "filter": { "my_lowercase": { "type": "stemmer" } }
D. "filter": { "my_lowercase": { "type": "tokenizer" } }
Describe how lowercase, stemmer, and synonym token filters help improve search results in Elasticsearch.
Think about how each filter changes the words to match more queries.
Explain the order of operations in Elasticsearch analysis involving tokenizers and token filters.
Remember token filters work after tokenizing.