Elasticsearch query · ~30 mins

Token filters (lowercase, stemmer, synonym) in Elasticsearch - Mini Project: Build & Apply

Create a Custom Elasticsearch Analyzer with Token Filters
📖 Scenario: You are setting up a search engine for a book store. You want to make sure that searches find books even if users type words in different forms or cases. For example, searching for "Running" should find books with "running" or "run". Also, some words have synonyms, like "quick" and "fast".
🎯 Goal: Build an Elasticsearch index with a custom analyzer that uses token filters: lowercase, stemmer, and synonym filters. This will help the search engine understand different word forms and synonyms.
📋 What You'll Learn
Create an index called books.
Define a custom analyzer named custom_analyzer.
Use the standard tokenizer in the analyzer.
Add three token filters to the analyzer: lowercase, english_stemmer, and synonym_filter.
Define the english_stemmer filter as a stemmer for English.
Define the synonym_filter with synonyms: quick,fast and jumps,leaps.
💡 Why This Matters
🌍 Real World
Search engines often need to understand different word forms and synonyms to give better results. This project shows how to set up such features in Elasticsearch.
💼 Career
Many jobs in search engineering, data engineering, and backend development require knowledge of text analysis and Elasticsearch configuration.
1
Create the index with basic settings
Create an Elasticsearch index called books with an empty settings object.
Hint: Use the PUT method to create the index books with an empty settings object.
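A minimal request that satisfies this step, in REST / Kibana Dev Tools syntax (one possible form, not the only accepted answer):

```
PUT /books
{
  "settings": {}
}
```

Elasticsearch responds with `"acknowledged": true` when the index is created.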

2
Add the custom analyzer with tokenizer and filters
Inside the settings, add an analysis section. Define a custom analyzer named custom_analyzer that uses the standard tokenizer and the token filters lowercase, english_stemmer, and synonym_filter in that order.
Hint: Remember to put the analysis section inside settings, and define the analyzer with the exact name custom_analyzer.
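The analyzer definition goes under `settings.analysis.analyzer`. A sketch of the request follows; note that Elasticsearch will only accept it once the two custom filters it references are also defined, which the next step covers:

```
PUT /books
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "english_stemmer", "synonym_filter"]
        }
      }
    }
  }
}
```

The order of the `filter` array matters: tokens are lowercased first, then stemmed, then passed through the synonym filter.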

3
Define the stemmer and synonym token filters
Add the token filters english_stemmer and synonym_filter inside the analysis section. Define english_stemmer as a stemmer filter with stemmer type and english language. Define synonym_filter as a synonym filter with synonyms quick,fast and jumps,leaps.
Hint: Define the filters inside the filter section under analysis, using the exact names english_stemmer and synonym_filter.
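Putting the pieces together, the complete settings body looks like this; the filter definitions live in `settings.analysis.filter`, alongside the `analyzer` section:

```
PUT /books
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "english_stemmer", "synonym_filter"]
        }
      },
      "filter": {
        "english_stemmer": {
          "type": "stemmer",
          "language": "english"
        },
        "synonym_filter": {
          "type": "synonym",
          "synonyms": ["quick,fast", "jumps,leaps"]
        }
      }
    }
  }
}
```

Each comma-separated rule (e.g. `"quick,fast"`) declares an equivalence set, so a search for either word matches documents containing the other.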

4
Test the analyzer with a sample text
Use the _analyze API to test the custom_analyzer on the text "The quick brown fox jumps running fast". Print the tokens produced by the analyzer.
Hint: Use the POST /books/_analyze API with the custom_analyzer and the given text. The output tokens should include stemmed and synonym forms.
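The test request, again in REST syntax:

```
POST /books/_analyze
{
  "analyzer": "custom_analyzer",
  "text": "The quick brown fox jumps running fast"
}
```

Because lowercasing and stemming run before synonym expansion, expect stemmed tokens such as `run` and `jump` alongside the expansions for the quick/fast pair. The exact token list can vary by Elasticsearch version, since newer versions also analyze the synonym rules themselves with the filters that precede the synonym filter.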