Challenge - 5 Problems
Character Filter Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
❓ Predict Output
Intermediate · 2:00 remaining
What is the output of this character filter configuration?
Given the following Elasticsearch analyzer configuration using a mapping character filter, what will be the output tokens for the input text
"foo-bar baz"?Elasticsearch
{
"analysis": {
"char_filter": {
"my_mapping": {
"type": "mapping",
"mappings": ["- => ""]"]
}
},
"analyzer": {
"my_analyzer": {
"type": "custom",
"char_filter": ["my_mapping"],
"tokenizer": "whitespace"
}
}
}
}
💡 Hint
Think about what the mapping character filter does to the dash (-) character before tokenization.
✗ Incorrect
The mapping character filter replaces the dash (-) with an empty string, so "foo-bar" becomes "foobar" before the whitespace tokenizer splits the text. Thus, the tokens are ["foobar", "baz"].
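The two-step pipeline can be sketched in plain Python (a simulation for illustration, not the Elasticsearch implementation): the character filter rewrites the raw text first, then the whitespace tokenizer splits the result.

```python
def mapping_char_filter(text, mappings):
    # Apply each "source => target" rule to the raw text,
    # mimicking Elasticsearch's mapping character filter.
    for rule in mappings:
        source, target = (part.strip() for part in rule.split("=>"))
        text = text.replace(source, target)
    return text

def whitespace_tokenizer(text):
    # The whitespace tokenizer splits on runs of whitespace.
    return text.split()

filtered = mapping_char_filter("foo-bar baz", ["- => "])
print(whitespace_tokenizer(filtered))  # ['foobar', 'baz']
```

Because the dash is removed before tokenization, the tokenizer never sees "foo-bar" as two parts.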
🧠 Conceptual
Intermediate · 1:30 remaining
Which character filter type removes HTML tags from input text?
In Elasticsearch, which character filter type is designed to remove HTML tags from the input text before tokenization?
💡 Hint
Think about the filter that cleans HTML content.
✗ Incorrect
The html_strip character filter removes HTML tags from the input text, cleaning it before tokenization.

❓ Predict Output
Advanced · 2:00 remaining
What error does this character filter configuration cause?
Consider this Elasticsearch character filter configuration snippet. What error will Elasticsearch raise when trying to create this analyzer?
Elasticsearch
{
"analysis": {
"char_filter": {
"bad_filter": {
"type": "mapping",
"mappings": "- >"
}
},
"analyzer": {
"test_analyzer": {
"type": "custom",
"char_filter": ["bad_filter"],
"tokenizer": "standard"
}
}
}
}
💡 Hint
Check the syntax of the mapping string.
✗ Incorrect
The mapping string "- >" is invalid because the mapping syntax requires '=>' to separate the source and target characters. This causes an ElasticsearchParseException.
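The check Elasticsearch applies to each rule can be approximated in Python (a simplified sketch of the validation, not the real parser):

```python
def parse_mapping_rule(rule):
    # A mapping rule must contain '=>' separating source and target;
    # a rule like "- >" has no separator and is rejected.
    if "=>" not in rule:
        raise ValueError(f"Invalid mapping rule: [{rule}]")
    source, target = rule.split("=>", 1)
    if not source.strip():
        raise ValueError(f"Invalid mapping rule: [{rule}] - empty source")
    return source.strip(), target.strip()

print(parse_mapping_rule("- => "))  # ('-', '')
try:
    parse_mapping_rule("- >")
except ValueError as err:
    print(err)  # Invalid mapping rule: [- >]
```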
🚀 Application
Advanced · 1:30 remaining
How many tokens are produced by this analyzer?
Given this analyzer configuration with a pattern_replace character filter that removes digits, how many tokens will be produced from the input
"abc123 def456 ghi789"?Elasticsearch
{
"analysis": {
"char_filter": {
"remove_digits": {
"type": "pattern_replace",
"pattern": "\\d",
"replacement": ""
}
},
"analyzer": {
"digit_remover": {
"type": "custom",
"char_filter": ["remove_digits"],
"tokenizer": "whitespace"
}
}
}
}
💡 Hint
Digits are removed before tokenization, but spaces remain.
✗ Incorrect
The pattern_replace filter removes all digits, so the input becomes "abc def ghi". The whitespace tokenizer splits on spaces, producing 3 tokens.
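The same result can be reproduced with a small Python simulation (an illustration of the behavior, not the Elasticsearch code): a regex substitution stands in for the pattern_replace filter, and a whitespace split stands in for the tokenizer.

```python
import re

def pattern_replace_char_filter(text, pattern, replacement):
    # Mimics the pattern_replace char filter: a regex substitution
    # applied to the raw text before tokenization.
    return re.sub(pattern, replacement, text)

filtered = pattern_replace_char_filter("abc123 def456 ghi789", r"\d", "")
tokens = filtered.split()          # whitespace tokenizer
print(tokens, len(tokens))         # ['abc', 'def', 'ghi'] 3
```

Note that the spaces are untouched by the pattern, so the token count stays at three even though every digit is gone.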
🔧 Debug
Expert · 2:30 remaining
Why does this analyzer produce unexpected tokens?
An Elasticsearch analyzer uses this character filter configuration but produces tokens with underscores instead of spaces. What is the cause?
Elasticsearch
{
"analysis": {
"char_filter": {
"underscore_to_space": {
"type": "mapping",
"mappings": ["_ => \\u0020"]
}
},
"analyzer": {
"custom_analyzer": {
"type": "custom",
"char_filter": ["underscore_to_space"],
"tokenizer": "whitespace"
}
}
}
}
💡 Hint
Check how JSON strings handle backslashes.
✗ Incorrect
In JSON, a backslash must itself be escaped. Written as "_ => \u0020", the escape sequence is consumed by the JSON parser before Elasticsearch ever sees it, so the mapping rule does not carry the intended \u0020 sequence. It should be "_ => \\u0020", which passes the literal \u0020 through to the mapping parser, which then decodes it as a space character.