Elasticsearch query · ~30 mins

Analyzer components (tokenizer, filters) in Elasticsearch - Mini Project: Build & Apply

Build a Custom Elasticsearch Analyzer with Tokenizer and Filters
📖 Scenario: You are setting up a search engine for a small online bookstore. You want to create a custom analyzer in Elasticsearch that breaks text into words and processes them to improve search results.
🎯 Goal: Create a custom Elasticsearch analyzer using a tokenizer and filters. You will define the tokenizer and filters, then combine them into an analyzer, and finally test the analyzer to see the processed tokens.
📋 What You'll Learn
Create an index with a custom analyzer named my_custom_analyzer
Use the standard tokenizer in the analyzer
Add a lowercase filter named lowercase
Add a stop filter named stop with stop words the, and, is
Test the analyzer with the text "The quick Brown fox jumps over the lazy Dog and is happy"
💡 Why This Matters
🌍 Real World
Custom analyzers help improve search quality by controlling how text is broken down and filtered before searching.
💼 Career
Understanding analyzers is key for roles in search engineering, backend development, and data engineering working with Elasticsearch.
1
Create the index with a custom analyzer using the standard tokenizer
Write the JSON to create an Elasticsearch index named books with a custom analyzer called my_custom_analyzer that uses the standard tokenizer.
Elasticsearch
Need a hint?

Define the my_custom_analyzer inside analysis.analyzer with the standard tokenizer.
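One possible request, sketched using the Elasticsearch REST API (the index name `books` and analyzer name `my_custom_analyzer` come from the task description):

```json
PUT /books
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard"
        }
      }
    }
  }
}
```

The `"type": "custom"` setting tells Elasticsearch to assemble the analyzer from the components you list, and `"tokenizer": "standard"` selects the built-in standard tokenizer, which splits text on word boundaries.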

2
Add lowercase and stop filters to the analyzer
Add the filters lowercase and stop to the my_custom_analyzer. Define the stop filter with stop words the, and, and is inside analysis.filter.
Elasticsearch
Need a hint?

Define the stop filter inside analysis.filter and add both lowercase and stop to the analyzer's filter list.
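A sketch of the full index settings with both filters wired into the analyzer (filter and analyzer names as specified in the task):

```json
PUT /books
{
  "settings": {
    "analysis": {
      "filter": {
        "stop": {
          "type": "stop",
          "stopwords": ["the", "and", "is"]
        }
      },
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "stop"]
        }
      }
    }
  }
}
```

Note that if the `books` index from step 1 already exists, you must delete it first (`DELETE /books`) before re-creating it, because analysis settings cannot be changed on an open index. Filter order matters: `lowercase` runs before `stop`, so "The" is lowered to "the" and then removed.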

3
Test the custom analyzer with sample text
Write the JSON query to analyze the text "The quick Brown fox jumps over the lazy Dog and is happy" using the my_custom_analyzer on the books index.
Elasticsearch
Need a hint?

Use the _analyze API with the analyzer set to my_custom_analyzer and provide the sample text.
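The test request might look like this, using the `_analyze` API scoped to the `books` index so it can find the custom analyzer:

```json
GET /books/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "The quick Brown fox jumps over the lazy Dog and is happy"
}
```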

4
Display the tokens produced by the analyzer
Print the tokens returned by the analyzer for the text "The quick Brown fox jumps over the lazy Dog and is happy". The output should list tokens without the stop words and all in lowercase.
Elasticsearch
Need a hint?

The tokens should be lowercase and exclude the, and, is.
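An abridged sketch of the expected `_analyze` response (each token object also carries `start_offset`, `end_offset`, and `type` fields, omitted here for brevity; positions reflect the removed stop words):

```json
{
  "tokens": [
    { "token": "quick", "position": 1 },
    { "token": "brown", "position": 2 },
    { "token": "fox",   "position": 3 },
    { "token": "jumps", "position": 4 },
    { "token": "over",  "position": 5 },
    { "token": "lazy",  "position": 7 },
    { "token": "dog",   "position": 8 },
    { "token": "happy", "position": 11 }
  ]
}
```

The stop words "the", "and", and "is" are gone, every remaining token is lowercase, and the gaps in the `position` values show where stop words were removed.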