Recall & Review
beginner
What is a custom analyzer in Elasticsearch?
A custom analyzer is a user-defined text processor that controls how text is broken down and transformed during indexing and searching. It combines zero or more character filters, exactly one tokenizer, and zero or more token filters to tailor text analysis.
beginner
Name the three main components of a custom analyzer.
The three main components are:
1. Character filters (modify text before tokenizing)
2. Tokenizer (splits text into tokens)
3. Token filters (modify tokens after tokenizing)
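The order of the three stages can be sketched in plain Python. This is only an illustration of the flow, not Elasticsearch's implementation: the function names are hypothetical, and the regular expressions are far simpler than the real `html_strip` character filter or `standard` tokenizer.

```python
import re


def strip_html(text: str) -> str:
    """Character filter stage: rewrites the raw string before tokenizing."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text: str) -> list[str]:
    """Tokenizer stage: splits the filtered string into tokens (simplified word split)."""
    return re.findall(r"\w+", text)


def lowercase(tokens: list[str]) -> list[str]:
    """Token filter stage: rewrites each token after tokenizing."""
    return [t.lower() for t in tokens]


def analyze(text: str) -> list[str]:
    # Stages always run in this order: character filters, tokenizer, token filters.
    return lowercase(tokenize(strip_html(text)))


print(analyze("<p>Quick Brown FOXES</p>"))  # ['quick', 'brown', 'foxes']
```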
intermediate
How do you define a custom analyzer in Elasticsearch settings?
You define it inside the index settings under the 'analysis' section, specifying the analyzer name and its components like tokenizer and filters. Example:
{
  "analysis": {
    "analyzer": {
      "my_analyzer": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": ["lowercase", "asciifolding"]
      }
    }
  }
}
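To show where that fragment sits in practice, here is a minimal sketch of a full create-index request body built as a Python dictionary. The `analysis` block goes under `settings`, and a text field refers to the analyzer by name in `mappings`. The field name `title` is a hypothetical example, not something from the card above.

```python
import json

# "my_analyzer" comes from the example above; "title" is a hypothetical field name.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "asciifolding"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            # The field opts in to the custom analyzer by name.
            "title": {"type": "text", "analyzer": "my_analyzer"}
        }
    },
}

# Print the JSON that would be sent as the body of the create-index request.
print(json.dumps(index_body, indent=2))
```

Defining the analyzer alone does not change how any field is analyzed; a field only uses it once its mapping names it.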
beginner
Why use the 'asciifolding' filter in a custom analyzer?
The 'asciifolding' filter converts accented characters to their ASCII equivalents (e.g., 'é' to 'e'), helping match text regardless of accents.
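The effect can be approximated in Python with Unicode decomposition. This is a rough sketch only: the function name `fold_accents` is hypothetical, and the real 'asciifolding' filter also covers characters that have no Unicode decomposition, which this version leaves unchanged.

```python
import unicodedata


def fold_accents(token: str) -> str:
    """Approximate accent folding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_accents("café"))      # cafe
print(fold_accents("Ångström"))  # Angstrom
```

Because the same folding is applied at index time and at search time, a query for 'cafe' and a document containing 'café' end up with the same token.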
intermediate
What happens if you omit token filters in a custom analyzer?
If you omit token filters, the analyzer applies any character filters and the tokenizer, then indexes the tokens exactly as the tokenizer emits them. Nothing is lowercased, folded, or removed, so 'Café' and 'cafe' remain different tokens and will not match each other.
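One way to see the difference is to send two request bodies to the `_analyze` API, which accepts an ad hoc `tokenizer` and `filter` list along with the `text` to analyze. The sketch below only builds and prints the two bodies; the sample text is an assumption, and sending the requests is left to whichever client you use.

```python
import json

text = "Café MENU"  # sample input, chosen for illustration

# Tokenizer only: tokens come back exactly as the tokenizer emits them.
without_filters = {"tokenizer": "standard", "text": text}

# Same tokenizer plus two token filters that normalize the tokens.
with_filters = {
    "tokenizer": "standard",
    "filter": ["lowercase", "asciifolding"],
    "text": text,
}

for body in (without_filters, with_filters):
    print(json.dumps(body, ensure_ascii=False))
```

The first body should return the tokens unchanged in case and accents, while the second should return lowercased, accent-free tokens.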
Which component of a custom analyzer splits text into words?
The tokenizer breaks the text into tokens or words.
Where do you define a custom analyzer in Elasticsearch?
Custom analyzers are defined in the index settings inside the 'analysis' section.
What does the 'lowercase' token filter do?
The 'lowercase' filter converts all tokens to lowercase to make searches case-insensitive.
Which filter would help match 'café' and 'cafe'?
'asciifolding' converts accented characters to ASCII equivalents, matching 'café' to 'cafe'.
If you want to remove common words like 'the' or 'and', which filter should you use?
The stop filter removes common stop words to improve search relevance.
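Filter order matters here, because stop-word matching is case-sensitive unless the filter is configured otherwise. The sketch below simulates that with a tiny hypothetical stop list; the function names are illustrative and the list is much shorter than the filter's built-in English set.

```python
STOP_WORDS = {"the", "and", "a", "of"}  # tiny illustrative list, all lowercase


def lowercase(tokens: list[str]) -> list[str]:
    return [t.lower() for t in tokens]


def remove_stop_words(tokens: list[str]) -> list[str]:
    # Case-sensitive membership test, so "The" is not treated as a stop word.
    return [t for t in tokens if t not in STOP_WORDS]


tokens = ["The", "Cat", "and", "the", "Hat"]

# Lowercase first, then remove stop words: every variant of "the" is dropped.
print(remove_stop_words(lowercase(tokens)))  # ['cat', 'hat']

# Stop words first: the capitalized "The" slips through.
print(lowercase(remove_stop_words(tokens)))  # ['the', 'cat', 'hat']
```

In analyzer settings this corresponds to listing 'lowercase' before 'stop' in the `filter` array.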
Explain how to create a custom analyzer in Elasticsearch and why you might want to use one.
Think about how text is processed before searching.
Describe the role of token filters in a custom analyzer and give two examples.
Filters help clean or change words for better matching.