Elasticsearch · Query · ~20 mins

Testing analyzers (_analyze API) in Elasticsearch - Practice Problems & Coding Challenges

Challenge - 5 Problems
🎖️
Analyzer Mastery
Get all challenges correct to earn this badge!
Test your skills under time pressure!
Predict Output
intermediate
Time limit: 2:00
What tokens does this _analyze request output?
Given the following Elasticsearch _analyze API request using the standard analyzer, what tokens will be produced?
Elasticsearch
{
  "analyzer": "standard",
  "text": "Quick Brown Foxes"
}
A. ["quick", "brown", "foxes"]
B. ["Quick", "Brown", "Foxes"]
C. ["quick brown foxes"]
D. ["quick", "brown", "fox"]
💡 Hint
The standard analyzer lowercases and splits on whitespace and punctuation.
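The hint's behavior can be sketched in Python. This is a rough approximation for simple ASCII text only, not the actual Lucene implementation (the real standard analyzer uses Unicode text segmentation, UAX #29):

```python
import re

def approx_standard_analyzer(text):
    # Rough approximation: split on runs of non-alphanumeric characters,
    # drop empty strings, then lowercase each token.
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]

print(approx_standard_analyzer("Foo-Bar Baz!"))  # → ['foo', 'bar', 'baz']
```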
Predict Output
intermediate
Time limit: 2:00
What tokens does the whitespace analyzer produce?
Using the _analyze API with the whitespace analyzer on the text "Hello, World! Welcome to Elasticsearch.", what tokens are returned?
Elasticsearch
{
  "analyzer": "whitespace",
  "text": "Hello, World! Welcome to Elasticsearch."
}
A. ["hello", "world", "welcome", "to", "elasticsearch"]
B. ["Hello", "World", "Welcome", "to", "Elasticsearch"]
C. ["Hello,", "World!", "Welcome", "to", "Elasticsearch."]
D. ["Hello,", "World", "Welcome", "to", "Elasticsearch"]
💡 Hint
Whitespace analyzer splits only on whitespace and does not lowercase or remove punctuation.
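The whitespace analyzer's split-only behavior is easy to mimic, since Python's `str.split()` with no arguments also splits on runs of whitespace. A minimal sketch:

```python
def approx_whitespace_analyzer(text):
    # The whitespace analyzer splits only on whitespace runs;
    # case and punctuation are left untouched.
    return text.split()

print(approx_whitespace_analyzer("Hi, there!"))  # → ['Hi,', 'there!']
```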
Predict Output
advanced
Time limit: 2:30
What tokens are output by a custom analyzer using lowercase and stop filters?
Given this _analyze request with a custom analyzer that uses the standard tokenizer, lowercase filter, and stop filter with stopwords ["the", "is"], what tokens are produced for the text "The quick brown fox is fast"?
Elasticsearch
{
  "tokenizer": "standard",
  "filter": ["lowercase", "stop"],
  "text": "The quick brown fox is fast"
}
A. ["quick", "brown", "fox", "fast"]
B. ["the", "quick", "brown", "fox", "is", "fast"]
C. ["quick", "brown", "fox", "is", "fast"]
D. ["The", "quick", "brown", "fox", "fast"]
💡 Hint
Stop filter removes common stopwords after lowercasing.
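The key point is pipeline order: tokenizer first, then each filter in the listed order. A hedged Python sketch of that pipeline (using a crude regex stand-in for the standard tokenizer):

```python
import re

STOPWORDS = {"the", "is"}  # stopword list configured on the stop filter

def approx_custom_analyzer(text):
    # 1. Tokenize (rough stand-in for the standard tokenizer).
    tokens = [t for t in re.split(r"[^A-Za-z0-9]+", text) if t]
    # 2. Lowercase filter runs before the stop filter, so "The" becomes
    #    "the" and is then caught by the stopword check.
    tokens = [t.lower() for t in tokens]
    # 3. Stop filter drops configured stopwords.
    return [t for t in tokens if t not in STOPWORDS]

print(approx_custom_analyzer("The cat is here"))  # → ['cat', 'here']
```

If the stop filter ran before lowercasing, capitalized stopwords like "The" would slip through, which is why filter order matters in the request.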
Predict Output
advanced
Time limit: 2:00
What error does this _analyze request produce?
What error will Elasticsearch return when running this _analyze request with an invalid tokenizer name?
Elasticsearch
{
  "tokenizer": "nonexistent_tokenizer",
  "text": "Test text"
}
A. 500 Internal Server Error
B. 400 Bad Request with error 'tokenizer [nonexistent_tokenizer] not found'
C. 200 OK with tokens ["test", "text"]
D. 404 Not Found
💡 Hint
Elasticsearch returns a 400 error when a requested tokenizer is missing.
🧠 Conceptual
expert
Time limit: 3:00
How many tokens are produced by this _analyze request with a pattern tokenizer?
Using the _analyze API with a pattern tokenizer that splits on commas and spaces (pattern: '[,\s]+') on the text "apple, banana orange,pear", how many tokens are produced?
Elasticsearch
{
  "tokenizer": {
    "type": "pattern",
    "pattern": "[,\\s]+"
  },
  "text": "apple, banana orange,pear"
}
A. 2
B. 3
C. 5
D. 4
💡 Hint
Count tokens split by commas or spaces.
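The pattern tokenizer splits the input on every regex match, which you can approximate with `re.split` (note that inside a JSON request body the backslash must itself be escaped, so `[,\s]+` is written `[,\\s]+`). A small sketch on a different sample string:

```python
import re

def approx_pattern_tokenizer(text, pattern=r"[,\s]+"):
    # The pattern tokenizer splits on each regex match; empty strings
    # produced by leading/trailing matches are discarded.
    return [t for t in re.split(pattern, text) if t]

tokens = approx_pattern_tokenizer("a, b c,d")
print(len(tokens), tokens)  # → 4 ['a', 'b', 'c', 'd']
```

Note that `,` followed by a space counts as a single match of `[,\s]+`, not two separate splits, so no empty token appears between them.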