Challenge - 5 Problems
Token Filter Master
❓ Predict Output
intermediate
What is the output of this Elasticsearch analyzer test?
Given the following analyzer configuration and input text, what tokens will be produced after applying the lowercase and stemmer filters?
Elasticsearch
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "english_stemmer"]
        }
      },
      "filter": {
        "english_stemmer": {
          "type": "stemmer",
          "language": "english"
        }
      }
    }
  }
}
Input text: "Running runners run quickly"
💡 Hint
Remember that the lowercase filter converts all tokens to lowercase before stemming. The stemmer reduces words to their root form.
Explanation
The lowercase filter converts every token to lowercase: 'Running' -> 'running', while 'runners', 'run', and 'quickly' are already lowercase. The English stemmer (a Porter stemmer) then reduces 'running' to 'run', 'runners' to 'runner', and 'quickly' to 'quickli'; 'run' is unchanged. So the final tokens are ['run', 'runner', 'run', 'quickli'].
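One way to check an answer like this is to send the same filter chain to the `_analyze` API. The sketch below only builds the request body in Python and prints it; it does not contact a cluster. The helper name `build_analyze_request` is hypothetical, and the stemmer is defined inline instead of through the `english_stemmer` name used in the settings above.

```python
import json


def build_analyze_request(text):
    """Build a body for POST /_analyze that mirrors my_analyzer.

    Hypothetical helper: it only assembles the JSON and sends nothing.
    """
    return {
        "tokenizer": "standard",
        "filter": [
            "lowercase",
            # Inline definition equivalent to the english_stemmer filter.
            {"type": "stemmer", "language": "english"},
        ],
        "text": text,
    }


body = build_analyze_request("Running runners run quickly")
print(json.dumps(body, indent=2))
```

Posting this body to `/_analyze` on a running cluster returns the token list, which is the most reliable way to confirm the stems.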
🧠 Conceptual
intermediate
Which token filter is responsible for replacing words with their synonyms?
In Elasticsearch, which token filter should you use to replace tokens with their synonyms during analysis?
💡 Hint
Think about the filter that changes words to other words with similar meaning.
Explanation
The synonym token filter replaces tokens with their synonyms based on a defined synonym list. Lowercase converts tokens to lowercase, stemmer reduces words to their root form, and stop removes common words.
🔧 Debug
advanced
Why does this synonym filter configuration cause an error?
Examine the following synonym filter configuration. Why does Elasticsearch reject it with a syntax error?
Elasticsearch
{
  "filter": {
    "my_synonym": {
      "type": "synonym",
      "synonyms": [
        "quick, fast",
        "jumps, leap"
      ]
    }
  }
}
💡 Hint
Check the latest Elasticsearch documentation for synonym filter types.
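Explanation
The intended answer is that 'synonym_graph' is the preferred filter type, because it handles multi-term synonyms correctly and is the one recommended for search-time analyzers. Note, however, that Elasticsearch still accepts the 'synonym' type, so the type name alone does not produce a syntax error. A rejection is more likely to come from where the fragment is placed: a filter definition must sit under settings.analysis.filter in the index settings.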
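For context, a filter definition like the one above is only a fragment: in index settings it lives under `settings.analysis.filter`. The sketch below shows one way to nest a `synonym_graph` filter in a full settings body. The helper `wrap_filters` and the analyzer name `my_search_analyzer` are hypothetical, and the code builds and prints JSON only; nothing is sent to Elasticsearch.

```python
import json


def wrap_filters(filters, analyzer_name, chain):
    """Nest filter definitions under settings.analysis (hypothetical helper)."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    analyzer_name: {"tokenizer": "standard", "filter": chain}
                },
                "filter": filters,
            }
        }
    }


# Same synonym rules as the quiz fragment, with the graph-aware filter type.
synonym_filter = {
    "my_synonym": {
        "type": "synonym_graph",
        "synonyms": ["quick, fast", "jumps, leap"],
    }
}

settings = wrap_filters(
    synonym_filter, "my_search_analyzer", ["lowercase", "my_synonym"]
)
print(json.dumps(settings, indent=2))
```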
📝 Syntax
advanced
Which option correctly defines a lowercase token filter in Elasticsearch?
Select the correct JSON snippet that defines a lowercase token filter named 'my_lowercase'.
💡 Hint
Check the exact spelling and case sensitivity of the 'type' property.
Explanation
The correct key is 'filter' (not 'filters'), and the type must be exactly 'lowercase' in lowercase letters without underscores. Option D matches this exactly.
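The two mistakes named above (a misspelled `filters` key and a wrongly cased or underscored type) can be illustrated with a toy check. This is a sketch for this quiz only: `check_lowercase_filter` is a hypothetical helper, not part of any Elasticsearch client, and it tests just these two conditions.

```python
def check_lowercase_filter(snippet, name="my_lowercase"):
    """Return a list of problems found in a filter snippet (toy check)."""
    problems = []
    if "filter" not in snippet:
        problems.append("missing 'filter' key")
        return problems
    definition = snippet["filter"].get(name)
    if definition is None:
        problems.append(f"no filter named {name!r}")
    elif definition.get("type") != "lowercase":
        problems.append(
            f"type must be 'lowercase', got {definition.get('type')!r}"
        )
    return problems


# A correct definition, then one example of each mistake.
good = {"filter": {"my_lowercase": {"type": "lowercase"}}}
bad_key = {"filters": {"my_lowercase": {"type": "lowercase"}}}
bad_type = {"filter": {"my_lowercase": {"type": "Lower_Case"}}}

for snippet in (good, bad_key, bad_type):
    print(check_lowercase_filter(snippet))
```

Only the first snippet yields an empty problem list.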
🚀 Application
expert
How many tokens are produced by this analyzer with synonym and lowercase filters?
Given this analyzer configuration and input, how many tokens will be produced?
Elasticsearch
{
  "settings": {
    "analysis": {
      "analyzer": {
        "syn_lower_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "my_synonym"]
        }
      },
      "filter": {
        "my_synonym": {
          "type": "synonym_graph",
          "synonyms": [
            "quick, fast",
            "jumps, leaps"
          ]
        }
      }
    }
  }
}
Input text: "The quick fox jumps"
💡 Hint
Remember that synonym_graph can produce multiple tokens for synonyms, increasing token count.
Explanation
After the standard tokenizer and the lowercase filter, the tokens are ['the', 'quick', 'fox', 'jumps']. The synonym_graph filter then adds 'fast' at the same position as 'quick' and 'leaps' at the same position as 'jumps'. The token stream becomes ['the', 'quick', 'fast', 'fox', 'jumps', 'leaps']: 6 tokens spread over 4 positions.
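The count can be reproduced with a toy model of the pipeline. The sketch below is not how Elasticsearch or Lucene implement `synonym_graph`: whitespace splitting stands in for the standard tokenizer, only single-word equivalent synonyms are handled, and the function names `parse_rules` and `analyze` are made up for illustration.

```python
def parse_rules(rules):
    """Map each term to the full group of equivalent terms it belongs to."""
    groups = {}
    for rule in rules:
        terms = [term.strip() for term in rule.split(",")]
        for term in terms:
            groups[term] = terms
    return groups


def analyze(text, rules):
    """Toy pipeline: whitespace split, lowercase, synonym expansion.

    Returns (token, position) pairs; synonyms share the original's position.
    """
    groups = parse_rules(rules)
    out = []
    for position, raw in enumerate(text.split()):
        token = raw.lower()
        # Emit the original token first, then the other terms in its group.
        out.append((token, position))
        for synonym in groups.get(token, []):
            if synonym != token:
                out.append((synonym, position))
    return out


tokens = analyze("The quick fox jumps", ["quick, fast", "jumps, leaps"])
print(tokens)
print(len(tokens), "tokens over", len({p for _, p in tokens}), "positions")
```

The last line prints `6 tokens over 4 positions`, matching the explanation: the synonyms raise the token count without adding positions.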