Text analysis breaks document text into tokens and normalizes them. This lets a search match words even when they appear in a different form, such as a different case or tense.
Why text analysis enables smart search in Elasticsearch
Introduction
When you want users to find documents even if they type different word forms.
When you need to ignore small differences like uppercase or punctuation in search.
When you want to find related words, like 'run' matching 'running'.
When you want to remove common words like 'the' or 'and' to focus on important words.
When you want a smaller index with fewer distinct terms, because simpler word forms are stored instead of every variation.
Syntax
Elasticsearch
PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "stop", "porter_stem"]
        }
      }
    }
  }
}
This example shows how to create a custom analyzer in Elasticsearch.
Analyzers break text into tokens and apply filters like lowercase and stemming.
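To make the pipeline concrete, here is a minimal Python sketch of what an analyzer like my_analyzer does conceptually. It is not Elasticsearch code. The regex tokenizer only approximates the standard tokenizer, STOP_WORDS is a small illustrative list, and toy_stem is a crude suffix stripper standing in for the real porter_stem filter. All names in it are hypothetical.

```python
import re

# Illustrative stop list (an assumption, not Elasticsearch's exact default list).
STOP_WORDS = {"a", "an", "and", "the", "of", "with", "to", "in"}


def toy_stem(token):
    """Crude suffix stripper standing in for porter_stem (not the Porter algorithm)."""
    for suffix in ("ing", "ed"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            stem = token[: -len(suffix)]
            # Undo a doubled final consonant: "runn" -> "run".
            if stem[-1] == stem[-2] and stem[-1] not in "aeiou":
                stem = stem[:-1]
            return stem
    # Strip a plain plural "s": "winds" -> "wind".
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def analyze(text):
    """Tokenize, lowercase, drop stop words, then stem, in that order."""
    tokens = re.findall(r"\w+", text)                    # rough stand-in for the tokenizer
    tokens = [t.lower() for t in tokens]                 # "lowercase" step
    tokens = [t for t in tokens if t not in STOP_WORDS]  # "stop" step
    return [toy_stem(t) for t in tokens]                 # stemming step


print(analyze("Running faster than the wind"))
# ['run', 'faster', 'than', 'wind']
```

The order of the steps matters. Lowercasing runs before stop word removal so that "The" is treated the same as "the".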
Examples
This runs the custom analyzer on a sample text to see how it breaks down words.
Elasticsearch
GET /my_index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "Running faster than the wind"
}
This creates a simple analyzer that lowercases and splits text by non-letters.
Elasticsearch
PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "simple_analyzer": {
          "type": "simple"
        }
      }
    }
  }
}
Sample Program
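This example creates a books index whose title field uses a custom analyzer that lowercases, removes stop words, and stems words. It then adds a book title and searches for it using related word forms. The match query runs the query text through the same analyzer as the field, so 'run wind' and 'Running with the Wind' are compared in the same normalized form.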
Elasticsearch
PUT /books
{
  "settings": {
    "analysis": {
      "analyzer": {
        "smart_search_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "stop", "porter_stem"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "smart_search_analyzer"
      }
    }
  }
}

POST /books/_doc
{
  "title": "Running with the Wind"
}

GET /books/_search
{
  "query": {
    "match": {
      "title": "run wind"
    }
  }
}
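Why does 'run wind' find 'Running with the Wind'? The Python sketch below imitates the idea with a toy inverted index. It is an illustration under assumptions, not how Elasticsearch is implemented: ToyIndex, analyze, and toy_stem are hypothetical names, the stop list is illustrative, and the stemmer is a crude stand-in for porter_stem. The one detail taken from Elasticsearch is that a match query combines its terms with OR by default.

```python
import re
from collections import defaultdict

# Illustrative stop list (an assumption, not Elasticsearch's exact default list).
STOP_WORDS = {"a", "an", "and", "the", "of", "with", "to", "in"}


def toy_stem(token):
    """Crude suffix stripper standing in for porter_stem (not the Porter algorithm)."""
    for suffix in ("ing", "ed"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            stem = token[: -len(suffix)]
            # Undo a doubled final consonant: "runn" -> "run".
            if stem[-1] == stem[-2] and stem[-1] not in "aeiou":
                stem = stem[:-1]
            return stem
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def analyze(text):
    """Lowercase, drop stop words, and stem each token."""
    tokens = [t.lower() for t in re.findall(r"\w+", text)]
    return [toy_stem(t) for t in tokens if t not in STOP_WORDS]


class ToyIndex:
    """Maps each analyzed term to the set of document ids that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, title):
        # Index time: store analyzed terms, not the raw words.
        for term in analyze(title):
            self.postings[term].add(doc_id)

    def match(self, query):
        # Search time: analyze the query the same way, then OR the terms.
        hits = set()
        for term in analyze(query):
            hits |= self.postings.get(term, set())
        return sorted(hits)


index = ToyIndex()
index.add(1, "Running with the Wind")
print(index.match("run wind"))  # [1]
```

Because both the title and the query pass through the same analyze function, they meet on the shared terms "run" and "wind" even though the raw text differs.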
Important Notes
Text analysis makes word matching more flexible: by default the same analyzer runs on both the indexed text and the query, so both sides are compared in normalized form.
Stop words are common words that usually do not add meaning and can be ignored.
Stemming reduces words to their root form to match different word variations.
Summary
Text analysis breaks text into smaller parts for better searching.
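It helps a query match a document even when word forms differ or the text contains extra common words.
A custom analyzer combines a tokenizer with token filters such as lowercase, stop, and porter_stem to control how matching works in Elasticsearch.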