Elasticsearch query · ~10 mins

Tokenizers (standard, whitespace, pattern) in Elasticsearch - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
Task 1 · Fill in the blank · Easy

Complete the code to define a standard tokenizer in Elasticsearch.

Elasticsearch
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_tokenizer": {
          "type": "[1]"
        }
      }
    }
  }
}
A. keyword
B. standard
C. pattern
D. whitespace
Common Mistakes
Using 'whitespace' instead of 'standard' when you want word boundaries.
Confusing 'pattern' tokenizer with 'standard' tokenizer.
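To see how the standard tokenizer behaves, you can run a quick check with the `_analyze` API. This is an illustrative sketch (the sample text is arbitrary): the standard tokenizer splits on Unicode word boundaries and drops most punctuation.

```json
POST _analyze
{
  "tokenizer": "standard",
  "text": "The 2 QUICK brown-foxes."
}
```

This should produce the tokens `The`, `2`, `QUICK`, `brown`, `foxes`: the hyphen and trailing period are treated as boundaries and removed, but letter case is preserved (tokenizers do not lowercase; that is a token filter's job).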
Task 2 · Fill in the blank · Medium

Complete the code to define a whitespace tokenizer in Elasticsearch.

Elasticsearch
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_ws_tokenizer": {
          "type": "[1]"
        }
      }
    }
  }
}
A. whitespace
B. pattern
C. standard
D. keyword
Common Mistakes
Using 'standard' tokenizer when whitespace tokenizer is needed.
Confusing 'pattern' tokenizer with 'whitespace' tokenizer.
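The contrast with the standard tokenizer is easiest to see on punctuated text. A sketch using the `_analyze` API (sample text is arbitrary): the whitespace tokenizer splits only on whitespace and leaves punctuation attached to tokens.

```json
POST _analyze
{
  "tokenizer": "whitespace",
  "text": "brown-foxes jumped over."
}
```

This should produce `brown-foxes`, `jumped`, `over.`: the hyphenated word stays whole and the trailing period is kept, which the standard tokenizer would have stripped.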
Task 3 · Fill in the blank · Hard

Fix the error in the pattern tokenizer definition by completing the pattern field so the tokenizer splits on runs of whitespace.

Elasticsearch
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_pattern_tokenizer": {
          "type": "pattern",
          "pattern": "[1]"
        }
      }
    }
  }
}
A. \\d+
B. \\w+
C. \\s+
D. \\S+
Common Mistakes
Using '\w+' which matches word characters, not spaces.
Using '\d+' which matches digits, not spaces.
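A key detail of the pattern tokenizer: by default the regex matches the *separators*, not the tokens. A sketch with an inline tokenizer definition in the `_analyze` API (sample text is arbitrary):

```json
POST _analyze
{
  "tokenizer": {
    "type": "pattern",
    "pattern": "\\s+"
  },
  "text": "one  two\tthree"
}
```

With `\s+` as the separator pattern, this should produce `one`, `two`, `three`, collapsing the double space and the tab. Using `\w+` instead would match the words themselves as separators and yield empty output.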
Task 4 · Fill in the blank · Hard

Fill both blanks to define a pattern tokenizer that splits on commas and spaces.

Elasticsearch
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "comma_space_tokenizer": {
          "type": "[1]",
          "pattern": "[2]"
        }
      }
    }
  }
}
A. pattern
B. \\s*,\\s*
C. \\s+
D. whitespace
Common Mistakes
Using 'whitespace' type with a pattern field.
Using a pattern that does not match commas.
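A separator pattern like `\s*,\s*` also absorbs the whitespace around each comma, so tokens come out already trimmed. An illustrative `_analyze` check (sample text is arbitrary):

```json
POST _analyze
{
  "tokenizer": {
    "type": "pattern",
    "pattern": "\\s*,\\s*"
  },
  "text": "red, green ,blue"
}
```

This should produce `red`, `green`, `blue` with no leading or trailing spaces, regardless of how the input spaces its commas.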
Task 5 · Fill in the blank · Hard

Fill all three blanks to define a custom analyzer using the standard tokenizer and lowercase filter.

Elasticsearch
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "tokenizer": "[1]",
          "filter": ["[2]", "[3]"]
        }
      }
    }
  }
}
A. standard
B. lowercase
C. stop
D. whitespace
Common Mistakes
Using 'whitespace' tokenizer instead of 'standard'.
Omitting the 'stop' filter when filtering stop words is needed.
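A custom analyzer chains one tokenizer with zero or more token filters, applied in order. You can test the whole chain at once with the `_analyze` API; this sketch assumes the default (English) stopword list for the `stop` filter, and the sample text is arbitrary:

```json
POST _analyze
{
  "tokenizer": "standard",
  "filter": ["lowercase", "stop"],
  "text": "The QUICK Foxes"
}
```

This should produce `quick` and `foxes`: `lowercase` runs first, turning `The` into `the`, which the `stop` filter then removes as a stopword. Filter order matters here; if `stop` ran before `lowercase`, the capitalized `The` would not match the lowercase stopword list.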