Challenge - 5 Problems
Analyzer Mastery
Get all challenges correct to earn this badge!
Test your skills under time pressure!
❓ Predict Output
Intermediate · 2:00 remaining
What tokens does this _analyze request output?
Given the following Elasticsearch _analyze API request using the standard analyzer, what tokens will be produced?
Elasticsearch
{
"analyzer": "standard",
"text": "Quick Brown Foxes"
}
💡 Hint
The standard analyzer splits text on word boundaries and lowercases every token.
✗ Incorrect
The standard analyzer uses grammar-based tokenization (Unicode Text Segmentation) to break text into terms, then lowercases every token. So 'Quick Brown Foxes' becomes ['quick', 'brown', 'foxes'].
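The standard analyzer's behavior on this input can be approximated in plain Python. This is a sketch only: the real analyzer implements Unicode Text Segmentation, but for simple ASCII text like this, splitting on runs of non-alphanumeric characters and lowercasing gives the same tokens:

```python
import re

def standard_analyze(text):
    # Rough approximation of the standard analyzer:
    # split on runs of non-alphanumeric characters, then lowercase.
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]

print(standard_analyze("Quick Brown Foxes"))
# ['quick', 'brown', 'foxes']
```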
❓ Predict Output
Intermediate · 2:00 remaining
What tokens does the whitespace analyzer produce?
Using the _analyze API with the whitespace analyzer on the text "Hello, World! Welcome to Elasticsearch.", what tokens are returned?
Elasticsearch
{
"analyzer": "whitespace",
"text": "Hello, World! Welcome to Elasticsearch."
}
💡 Hint
The whitespace analyzer splits only on whitespace; it does not lowercase or remove punctuation.
✗ Incorrect
The whitespace analyzer splits text only on whitespace and does not lowercase or remove punctuation, so the tokens keep their original casing and punctuation: ['Hello,', 'World!', 'Welcome', 'to', 'Elasticsearch.'].
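For inputs like this, the whitespace analyzer behaves like Python's `str.split()` with no arguments (a rough sketch: both split on runs of whitespace and apply no other processing):

```python
def whitespace_analyze(text):
    # The whitespace analyzer only splits on whitespace runs;
    # casing and punctuation are left untouched.
    return text.split()

print(whitespace_analyze("Hello, World! Welcome to Elasticsearch."))
# ['Hello,', 'World!', 'Welcome', 'to', 'Elasticsearch.']
```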
❓ Predict Output
Advanced · 2:30 remaining
What tokens does a custom analyzer with lowercase and stop filters output?
Given this _analyze request with a custom analyzer that uses the standard tokenizer, lowercase filter, and stop filter with stopwords ["the", "is"], what tokens are produced for the text "The quick brown fox is fast"?
Elasticsearch
{
"tokenizer": "standard",
"filter": ["lowercase", "stop"],
"text": "The quick brown fox is fast"
}
💡 Hint
The stop filter removes the configured stopwords after the lowercase filter runs.
✗ Incorrect
The standard tokenizer splits the text into words, the lowercase filter lowercases every token, and the stop filter then removes 'the' and 'is'. So the tokens are ['quick', 'brown', 'fox', 'fast'].
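The same pipeline (tokenize, lowercase, drop stopwords) can be sketched in Python. This illustrates the filter order, not Elasticsearch's actual implementation; note that the stop filter sees already-lowercased tokens, which is why 'The' is removed:

```python
import re

STOPWORDS = {"the", "is"}  # matches the custom analyzer's stop filter config

def custom_analyze(text):
    # 1. Standard-tokenizer approximation: split on non-alphanumerics.
    tokens = [t for t in re.split(r"[^0-9A-Za-z]+", text) if t]
    # 2. Lowercase filter.
    tokens = [t.lower() for t in tokens]
    # 3. Stop filter: remove configured stopwords (runs after lowercasing).
    return [t for t in tokens if t not in STOPWORDS]

print(custom_analyze("The quick brown fox is fast"))
# ['quick', 'brown', 'fox', 'fast']
```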
❓ Predict Output
Advanced · 2:00 remaining
What error does this _analyze request produce?
What error will Elasticsearch return when running this _analyze request with an invalid tokenizer name?
Elasticsearch
{
"tokenizer": "nonexistent_tokenizer",
"text": "Test text"
}
💡 Hint
Elasticsearch returns a 400 error when a requested tokenizer is missing.
✗ Incorrect
If the tokenizer name is invalid, Elasticsearch returns a 400 Bad Request with a message indicating the tokenizer was not found.
🧠 Conceptual
Expert · 3:00 remaining
How many tokens are produced by this _analyze request with a pattern tokenizer?
Using the _analyze API with a pattern tokenizer that splits on commas and spaces (pattern: '[,\s]+') on the text "apple, banana orange,pear", how many tokens are produced?
Elasticsearch
{
"tokenizer": {
"type": "pattern",
"pattern": "[,\s]+"
},
"text": "apple, banana orange,pear"
}
💡 Hint
Count tokens split by commas or spaces.
✗ Incorrect
The pattern splits on commas or spaces, so the text splits into ['apple', 'banana', 'orange', 'pear'], which is 4 tokens.
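The pattern tokenizer's split can be reproduced with Python's `re.split` (a sketch; Elasticsearch's pattern tokenizer uses Java regular expressions, but this particular pattern behaves identically in both):

```python
import re

def pattern_analyze(text, pattern=r"[,\s]+"):
    # Split on the regex; drop empty strings produced at the edges.
    return [t for t in re.split(pattern, text) if t]

tokens = pattern_analyze("apple, banana orange,pear")
print(tokens, len(tokens))
# ['apple', 'banana', 'orange', 'pear'] 4
```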