Autocomplete with edge n-gram in Elasticsearch

Introduction

Autocomplete helps users find words or phrases quickly as they type. An edge n-gram tokenizer indexes every prefix of each word (for example, "apple" becomes "a", "ap", "app", "appl", "apple"), so partial input can be matched with a fast exact-term lookup.

Use edge n-gram autocomplete when you want to:
Suggest search terms as users type in a search box.
Speed up search by matching the beginnings of words.
Support prefix matching for product names or tags.
Improve user experience by showing instant suggestions.
Provide quick autocomplete even on large text fields.
Syntax
Elasticsearch
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete_analyzer": {
          "tokenizer": "autocomplete_tokenizer",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "autocomplete_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20,
          "token_chars": ["letter"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "field_name": {
        "type": "text",
        "analyzer": "autocomplete_analyzer",
        "search_analyzer": "standard"
      }
    }
  }
}

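The edge_ngram tokenizer splits text into words at every character that is not listed in token_chars, then emits each word's prefixes from min_gram to max_gram characters long.

Set search_analyzer to standard so that the user's input is matched as whole terms instead of being split into n-grams itself.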
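
To see what the tokenizer in the configuration above actually emits, the _analyze API can run an inline analyzer definition against sample text without creating an index. The request below is a minimal sketch; the sample text "Quick Fox" is an arbitrary choice and not part of the original example.

```console
# Run the same tokenizer and filter as autocomplete_analyzer on sample text.
POST /_analyze
{
  "tokenizer": {
    "type": "edge_ngram",
    "min_gram": 1,
    "max_gram": 20,
    "token_chars": ["letter"]
  },
  "filter": ["lowercase"],
  "text": "Quick Fox"
}
```

With these settings the response should list the tokens q, qu, qui, quic, quick, f, fo, fox. The space is not a letter, so it splits the text into two words, and each word yields all of its prefixes.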

Examples
This example lowers max_gram to 10, which keeps the index smaller and suits cases where users rarely type more than 10 characters of a word before picking a suggestion.
Elasticsearch
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete_analyzer": {
          "tokenizer": "autocomplete_tokenizer",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "autocomplete_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 10,
          "token_chars": ["letter"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "product_name": {
        "type": "text",
        "analyzer": "autocomplete_analyzer",
        "search_analyzer": "standard"
      }
    }
  }
}
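
A side effect of a low max_gram is that a search term longer than max_gram has no indexed token to match, so with the settings above a query such as "smartphones" (11 characters) finds nothing. One possible workaround, sketched below, is a search analyzer that cuts query terms down to max_gram with the built-in truncate token filter. The names autocomplete_search and truncate_to_max_gram are hypothetical, and these entries would be merged into the existing analysis section.

```console
{
  "settings": {
    "analysis": {
      "filter": {
        "truncate_to_max_gram": {
          "type": "truncate",
          "length": 10
        }
      },
      "analyzer": {
        "autocomplete_search": {
          "tokenizer": "standard",
          "filter": ["lowercase", "truncate_to_max_gram"]
        }
      }
    }
  }
}
```

The product_name mapping would then use "search_analyzer": "autocomplete_search" instead of "standard". The trade-off is that any indexed word sharing the first 10 characters of the query term now counts as a match.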
This example adds digit to token_chars so that digits stay inside tokens, and raises min_gram to 2 to skip one-character prefixes, which match too many documents to be useful.
Elasticsearch
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete_analyzer": {
          "tokenizer": "autocomplete_tokenizer",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "autocomplete_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 15,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "username": {
        "type": "text",
        "analyzer": "autocomplete_analyzer",
        "search_analyzer": "standard"
      }
    }
  }
}
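
The effect of adding "digit" and raising min_gram can be checked with the _analyze API. This is a sketch that uses an inline copy of the tokenizer above; the sample username is made up.

```console
# Inline copy of the username tokenizer, applied to a made-up value.
POST /_analyze
{
  "tokenizer": {
    "type": "edge_ngram",
    "min_gram": 2,
    "max_gram": 15,
    "token_chars": ["letter", "digit"]
  },
  "filter": ["lowercase"],
  "text": "jdoe42"
}
```

The expected tokens are jd, jdo, jdoe, jdoe4, jdoe42. With "token_chars": ["letter"] alone, the digits would act as separators and only jd, jdo, jdoe would be indexed. Because min_gram is 2, a query for the single letter "j" matches nothing.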
Edge case: If the field is empty or missing in a document, no tokens are indexed for it, so that document never appears in autocomplete suggestions.
Elasticsearch
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete_analyzer": {
          "tokenizer": "autocomplete_tokenizer",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "autocomplete_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20,
          "token_chars": ["letter"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "empty_field": {
        "type": "text",
        "analyzer": "autocomplete_analyzer",
        "search_analyzer": "standard"
      }
    }
  }
}
Sample Program

This program creates an index with edge n-gram autocomplete on product_name. It adds three products and searches for "App" to get autocomplete suggestions starting with "App".

Elasticsearch
PUT /autocomplete_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete_analyzer": {
          "tokenizer": "autocomplete_tokenizer",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "autocomplete_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 10,
          "token_chars": ["letter"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "product_name": {
        "type": "text",
        "analyzer": "autocomplete_analyzer",
        "search_analyzer": "standard"
      }
    }
  }
}

POST /autocomplete_example/_doc/1
{
  "product_name": "Apple MacBook Pro"
}

POST /autocomplete_example/_doc/2
{
  "product_name": "Apple Watch"
}

POST /autocomplete_example/_doc/3
{
  "product_name": "Samsung Galaxy"
}

GET /autocomplete_example/_search
{
  "query": {
    "match": {
      "product_name": "App"
    }
  }
}
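Output

The search returns documents 1 and 2 ("Apple MacBook Pro" and "Apple Watch"). The standard search analyzer lowercases "App" to "app", which equals one of the prefix tokens indexed for "Apple"; "Samsung Galaxy" has no such token and is not returned.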
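
Autocomplete input often contains more than one word. By default a match query combines its terms with OR, so "apple wat" would still return both Apple products. The sketch below assumes the autocomplete_example index from the sample program; it first refreshes the index so that recently added documents are searchable, then requires every typed term to match.

```console
# Make documents indexed a moment ago visible to search.
POST /autocomplete_example/_refresh

# Require all typed terms to match a prefix token.
GET /autocomplete_example/_search
{
  "query": {
    "match": {
      "product_name": {
        "query": "apple wat",
        "operator": "and"
      }
    }
  }
}
```

With the three sample documents, only "Apple Watch" should match, because it is the only product with prefix tokens for both "apple" and "wat".
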
Important Notes

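Time complexity: Indexing is slower because each word produces up to max_gram - min_gram + 1 tokens; search stays fast because each query term is looked up as an exact term against prefixes that were computed at index time.

Space complexity: The index uses more disk space because each word is stored as several tokens. Lowering max_gram or raising min_gram reduces the overhead.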

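Common mistake: Using the edge n-gram analyzer for search input instead of the standard analyzer. The query itself is then split into prefixes, so a document matches as soon as it shares a single short prefix with the input, which floods the results with irrelevant hits.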
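
The mistake is easy to reproduce on the autocomplete_example index from the sample program by overriding the analyzer inside the query. This is an illustrative sketch; the query text "Sap" is chosen only because no product name starts with it.

```console
# Deliberately analyze the query with the n-gram analyzer (the mistake).
GET /autocomplete_example/_search
{
  "query": {
    "match": {
      "product_name": {
        "query": "Sap",
        "analyzer": "autocomplete_analyzer"
      }
    }
  }
}
```

The query text is split into s, sa, sap, and "Samsung Galaxy" is returned because it shares the tokens s and sa. Without the analyzer override, the standard search analyzer produces the single term sap and nothing matches.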

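Use edge n-gram when suggestions must match a prefix of any word in the field or be combined with ordinary queries and filters; use the completion suggester when raw speed matters most, since it is faster but less flexible.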
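
For comparison, a minimal completion suggester setup might look like the following. The index name suggest_example, the field name suggest, and the suggestion name product_suggest are hypothetical.

```console
# A dedicated completion field holds the suggestion inputs.
PUT /suggest_example
{
  "mappings": {
    "properties": {
      "suggest": { "type": "completion" }
    }
  }
}

# Index one suggestion and refresh so it is immediately searchable.
PUT /suggest_example/_doc/1?refresh
{
  "suggest": { "input": ["Apple Watch"] }
}

# Ask for suggestions whose input starts with "app".
POST /suggest_example/_search
{
  "suggest": {
    "product_suggest": {
      "prefix": "app",
      "completion": { "field": "suggest" }
    }
  }
}
```

Here "app" should suggest "Apple Watch", while "wat" returns nothing unless "Watch" is added as a separate input.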

Summary

The edge n-gram tokenizer indexes the prefixes of each word, which turns autocomplete into a fast exact-term lookup.

Index with a custom analyzer built on the edge n-gram tokenizer, and search with the standard analyzer.

Autocomplete improves user experience by suggesting matches as users type.