Challenge - 5 Problems
Elasticsearch Analyzer Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
❓ Predict Output
intermediate · 2:00
What is the output of this Elasticsearch analyzer test?
Given the following analyzer configuration and input text, what is the list of tokens produced?
Elasticsearch
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "stop"]
        }
      },
      "filter": {
        "stop": {
          "type": "stop",
          "stopwords": ["the", "is"]
        }
      }
    }
  }
}
Input text: "The quick brown fox is jumping"
💡 Hint
Remember that the stop filter removes specified stopwords after tokenizing and lowercasing.
✅ Explanation
The standard tokenizer splits the text into the words "The", "quick", "brown", "fox", "is", "jumping". The lowercase filter converts all tokens to lowercase, and the stop filter then removes the configured stopwords "the" and "is", leaving "quick", "brown", "fox", and "jumping".
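This three-stage pipeline can be sketched in plain Python. The regex split is only an approximation of the standard tokenizer (which actually follows Unicode text segmentation rules), but it behaves the same on this input:

```python
import re

def analyze(text, stopwords=("the", "is")):
    # Approximate the standard tokenizer: split on runs of non-word characters.
    tokens = [t for t in re.split(r"\W+", text) if t]
    # Lowercase filter.
    tokens = [t.lower() for t in tokens]
    # Stop filter: drop the configured stopwords (applied after lowercasing).
    return [t for t in tokens if t not in stopwords]

print(analyze("The quick brown fox is jumping"))
# → ['quick', 'brown', 'fox', 'jumping']
```

Note that the filter chain runs in the order listed in the analyzer, which is why "The" is already "the" by the time the stop filter sees it.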
❓ Predict Output
intermediate · 2:00
What tokens result from this custom analyzer with a pattern tokenizer?
Given this analyzer configuration and input, what tokens are produced?
Elasticsearch
{
  "settings": {
    "analysis": {
      "analyzer": {
        "pattern_analyzer": {
          "tokenizer": "pattern",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "pattern": {
          "type": "pattern",
          "pattern": "\\W+"
        }
      }
    }
  }
}
Input text: "Hello, World! Welcome to Elasticsearch."
💡 Hint
The pattern tokenizer splits on non-word characters, and the lowercase filter converts tokens to lowercase.
✅ Explanation
The pattern tokenizer splits the text on any run of non-word characters (spaces, commas, and other punctuation), so no punctuation survives tokenization. The lowercase filter then converts every token to lowercase. The empty string produced by the trailing period is discarded, yielding "hello", "world", "welcome", "to", "elasticsearch".
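Because the pattern tokenizer really does split on a configured regex, a Python sketch of this analyzer is close to the real behavior:

```python
import re

def pattern_analyze(text, pattern=r"\W+"):
    # Pattern tokenizer: split on the configured regex (here \W+, i.e. runs
    # of non-word characters) and discard empty tokens, e.g. from a
    # trailing period.
    tokens = [t for t in re.split(pattern, text) if t]
    # Lowercase filter.
    return [t.lower() for t in tokens]

print(pattern_analyze("Hello, World! Welcome to Elasticsearch."))
# → ['hello', 'world', 'welcome', 'to', 'elasticsearch']
```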
🔧 Debug
advanced · 2:00
Why does this analyzer configuration cause an error?
This analyzer configuration causes an error when indexing. What is the cause?
Elasticsearch
{
  "settings": {
    "analysis": {
      "analyzer": {
        "bad_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "nonexistent_filter"]
        }
      }
    }
  }
}
💡 Hint
Check if all filters used are defined in the analysis settings.
✅ Explanation
Elasticsearch requires that all filters used in an analyzer be defined or built-in. "nonexistent_filter" is not defined anywhere, so Elasticsearch throws a configuration error.
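The check can be mimicked (not reproduced) in a few lines of Python; the filter set and error message below are illustrative, not Elasticsearch's actual internals:

```python
# Illustrative subset of built-in token filter names; not the full list.
BUILT_IN_FILTERS = {"lowercase", "uppercase", "stop", "asciifolding", "trim"}

def validate_analyzer(analyzer, custom_filters):
    # Every filter referenced by the analyzer must be either built-in or
    # defined under the analysis settings, else configuration fails.
    for name in analyzer["filter"]:
        if name not in BUILT_IN_FILTERS and name not in custom_filters:
            raise ValueError(f"unknown token filter [{name}]")

bad_analyzer = {"tokenizer": "standard",
                "filter": ["lowercase", "nonexistent_filter"]}
try:
    validate_analyzer(bad_analyzer, custom_filters={})
except ValueError as e:
    print(e)  # → unknown token filter [nonexistent_filter]
```

Defining "nonexistent_filter" under `analysis.filter` (or removing it from the chain) resolves the error.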
❓ Predict Output
advanced · 2:00
What tokens does this analyzer with a synonym filter produce?
Given this analyzer configuration and input, what tokens are produced?
Elasticsearch
{
  "settings": {
    "analysis": {
      "analyzer": {
        "synonym_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "my_synonym"]
        }
      },
      "filter": {
        "my_synonym": {
          "type": "synonym",
          "synonyms": ["quick,fast"]
        }
      }
    }
  }
}
Input text: "The quick fox"
💡 Hint
By default, the synonym filter adds synonyms as extra tokens; it does not replace the original tokens.
✅ Explanation
The standard tokenizer splits the text into words, and the lowercase filter converts them to lowercase. The synonym filter then adds "fast" as an extra token at the same position as "quick". There is no stop filter here, so "the" survives, and the tokens are "the", "quick", "fast", "fox".
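A Python sketch of the expansion (the synonym table below is a hypothetical mirror of the "quick,fast" rule; real Elasticsearch also records that the synonym shares the original token's position, which is omitted here):

```python
import re

# "quick,fast" in expand mode: each term maps to the other terms in its group.
SYNONYMS = {"quick": ["fast"], "fast": ["quick"]}

def synonym_analyze(text):
    # Standard tokenizer approximated by a regex split, then lowercase filter.
    tokens = [t.lower() for t in re.split(r"\W+", text) if t]
    # Synonym filter: keep each original token and append its synonyms
    # as extra tokens.
    out = []
    for t in tokens:
        out.append(t)
        out.extend(SYNONYMS.get(t, []))
    return out

print(synonym_analyze("The quick fox"))
# → ['the', 'quick', 'fast', 'fox']
```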
🧠 Conceptual
expert · 3:00
Which filter order produces this token output?
You want to produce tokens from the input "Running runs run" that are stemmed and lowercase, but stopwords are removed after stemming. Which filter order in the analyzer produces the tokens ["run", "run", "run"]?
💡 Hint
Think about when stopwords are removed relative to stemming and lowercasing.
✅ Explanation
The tokens are first lowercased, then stemmed to "run", and only then are stopwords removed: the order is lowercase, stemmer, stop. Since "run" is not a stopword, all three tokens survive. If the stop filter ran before the stemmer, stopword removal would see the unstemmed forms "running" and "runs" instead, which could change the result for other inputs.
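The ordering can be sketched with a toy stemmer standing in for a real stemmer filter (the suffix-stripping rules below are illustrative, not any actual stemming algorithm):

```python
def stem(token):
    # Toy stemmer: strip a few English suffixes; "ning" is checked before
    # "ing" so that "running" reduces to "run" rather than "runn".
    for suffix in ("ning", "ing", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def analyze(text, stopwords=frozenset()):
    tokens = text.split()                             # stand-in tokenizer
    tokens = [t.lower() for t in tokens]              # 1. lowercase
    tokens = [stem(t) for t in tokens]                # 2. stemmer
    return [t for t in tokens if t not in stopwords]  # 3. stop, after stemming

print(analyze("Running runs run"))
# → ['run', 'run', 'run']
```

Running the stop filter last guarantees it tests the stemmed, lowercased forms, which is what the question's required output depends on.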