Custom analyzers in Elasticsearch - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is a custom analyzer in Elasticsearch?
A custom analyzer is a user-defined text processor that controls how text is broken down and transformed during indexing and searching. It combines optional character filters, exactly one tokenizer, and optional token filters to tailor text analysis.
beginner
Name the three main components of a custom analyzer.
The three main components are:
1. Character filters (modify text before tokenizing)
2. Tokenizer (splits text into tokens)
3. Token filters (modify tokens after tokenizing)
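The order of the three stages can be illustrated with a toy model in plain Python. This is only a sketch of the pipeline shape, not how Elasticsearch implements analysis: the function names (`strip_html`, `tokenize`, `lowercase`, `analyze`) are hypothetical, and the regex-based tokenizer is a rough stand-in for a real tokenizer.

```python
import re
from typing import List


def strip_html(text: str) -> str:
    """Toy character filter: replace anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text: str) -> List[str]:
    """Toy tokenizer: keep runs of word characters as tokens."""
    return re.findall(r"\w+", text)


def lowercase(tokens: List[str]) -> List[str]:
    """Toy token filter: lowercase every token."""
    return [t.lower() for t in tokens]


def analyze(text: str) -> List[str]:
    # Order matters: character filters -> tokenizer -> token filters.
    return lowercase(tokenize(strip_html(text)))


print(analyze("<b>Quick</b> Brown Fox"))
```

Each stage sees only the output of the previous one, which is why a character filter can remove markup before the tokenizer ever runs.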
intermediate
How do you define a custom analyzer in Elasticsearch settings?
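You define it in the index settings, under the 'analysis' section, by naming the analyzer and listing its components: a tokenizer plus any character filters and token filters. When creating an index, the 'analysis' object sits inside 'settings'. Example: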
{"analysis": {"analyzer": {"my_analyzer": {"type": "custom", "tokenizer": "standard", "filter": ["lowercase", "asciifolding"]}}}}
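For context, here is a sketch of how that fragment might sit inside a full create-index request body, built as a Python dictionary and printed as JSON. The field name `title` is an assumption added for illustration; the analyzer only takes effect on fields whose mapping references it.

```python
import json

# Body for a create-index request (PUT /<index-name>).
create_index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "asciifolding"],
                }
            }
        }
    },
    # Assumed field "title": a text field that uses the analyzer by name.
    "mappings": {
        "properties": {
            "title": {"type": "text", "analyzer": "my_analyzer"}
        }
    },
}

print(json.dumps(create_index_body, indent=2))
```

The analyzer is defined once under `settings` and then referenced by name from the mapping, so several fields can share the same definition.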
beginner
Why use the 'asciifolding' filter in a custom analyzer?
The 'asciifolding' filter converts accented characters to their ASCII equivalents (e.g., 'é' to 'e'), helping match text regardless of accents.
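The effect can be approximated in plain Python with Unicode decomposition. This is a rough sketch, not the actual filter: `fold_accents` is a hypothetical helper, and the real 'asciifolding' filter uses its own mapping table that covers characters this approach leaves unchanged.

```python
import unicodedata


def fold_accents(token: str) -> str:
    """Approximate ASCII folding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# After folding, the accented and unaccented spellings are the same token.
print(fold_accents("café") == fold_accents("cafe"))
```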
intermediate
What happens if you omit token filters in a custom analyzer?
If you omit token filters, the analyzer only tokenizes the text (after any character filters) and leaves the tokens unchanged. Tokens keep their original case and accents, so a search for 'cafe' would not match an indexed 'Café'.
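A small simulation shows why this matters for matching. It is a sketch under simplifying assumptions: the regex split stands in for a tokenizer, `tokenize_only` and `tokenize_and_lowercase` are hypothetical names, and matching is modelled as a plain set intersection of tokens.

```python
import re


def tokenize_only(text):
    # Stand-in for an analyzer with a tokenizer but no token filters.
    return re.findall(r"\w+", text)


def tokenize_and_lowercase(text):
    # Same tokenizer, followed by a lowercase token filter.
    return [t.lower() for t in tokenize_only(text)]


indexed = "The Quick Fox"
query = "quick"

# Without token filters, "quick" and "Quick" are different tokens.
print(set(tokenize_only(query)) & set(tokenize_only(indexed)))

# With lowercasing applied at index and search time, the tokens overlap.
print(set(tokenize_and_lowercase(query)) & set(tokenize_and_lowercase(indexed)))
```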
Which component of a custom analyzer splits text into words?
A. Tokenizer
B. Character filter
C. Token filter
D. Index filter
Where do you define a custom analyzer in Elasticsearch?
A. In the index settings under 'analysis'
B. In the document mapping
C. In the search query
D. In the cluster settings
What does the 'lowercase' token filter do?
A. Removes accents
B. Removes stop words
C. Splits text into tokens
D. Converts tokens to lowercase
Which filter would help match 'café' and 'cafe'?
A. lowercase
B. stop
C. asciifolding
D. synonym
If you want to remove common words like 'the' or 'and', which filter should you use?
A. asciifolding filter
B. stop filter
C. lowercase filter
D. keyword marker filter
Explain how to create a custom analyzer in Elasticsearch and why you might want to use one.
Think about how text is processed before searching.
Describe the role of token filters in a custom analyzer and give two examples.
Filters help clean or change words for better matching.