
Completion suggester in Elasticsearch - Deep Dive

Overview - Completion suggester
What is it?
The Completion suggester is a feature in Elasticsearch that helps provide fast, real-time search suggestions as you type. It is designed to autocomplete words or phrases based on indexed data, making search experiences smoother and quicker. It works by indexing terms in a special way to allow instant prefix matching. This helps users find what they want even if they only type part of a word or phrase.
Why it matters
Without the Completion suggester, search systems would be slower and less helpful when users type queries. Users might have to type full words or guess exact spellings, leading to frustration and missed results. The Completion suggester solves this by predicting and showing possible completions instantly, improving user experience and increasing engagement on websites or apps that rely on search.
Where it fits
Before learning about the Completion suggester, you should understand basic Elasticsearch concepts like indexing, documents, and queries. After mastering it, you can explore more advanced search features like fuzzy matching, phrase suggestions, and custom scoring to build powerful search applications.
Mental Model
Core Idea
The Completion suggester quickly predicts and completes search terms by indexing prefixes for instant lookup.
Think of it like...
It's like a smart text message keyboard that guesses your next word as you type, showing suggestions instantly so you can pick one without typing the whole word.
┌─────────────────────────────┐
│ User types prefix: "app"    │
└─────────────┬───────────────┘
              │
              ▼
┌─────────────────────────────┐
│ Completion suggester looks  │
│ up indexed inputs starting  │
│ with "app"                  │
└─────────────┬───────────────┘
              │
              ▼
┌─────────────────────────────┐
│ Returns suggestions like:   │
│ "apple", "application",     │
│ "appetite"                  │
└─────────────────────────────┘
Build-Up - 7 Steps
1. Foundation: What Is the Completion Suggester
Concept: Introduction to the Completion suggester and its purpose in Elasticsearch.
The Completion suggester is a special type of search feature in Elasticsearch that helps users get suggestions as they type. It indexes data in a way that allows very fast prefix matching, meaning it can quickly find words that start with the letters typed so far. This is useful for autocompleting search queries.
Result
You understand that the Completion suggester is designed to speed up search by predicting possible completions.
Understanding the Completion suggester's role helps you see how search can be made faster and more user-friendly.
2. Foundation: Setting Up a Completion Field
Concept: How to define a completion field in an Elasticsearch index mapping.
To use the Completion suggester, you must define a field in your index mapping with the type 'completion'. For example:
{
  "mappings": {
    "properties": {
      "suggest": { "type": "completion" }
    }
  }
}
This tells Elasticsearch to index this field specially for fast prefix lookups.
Result
The index is ready to store data that can be used for autocomplete suggestions.
Knowing how to set up the completion field is essential because it changes how data is stored and searched.
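As a rough sketch, the mapping from this step can be assembled in Python as a plain dictionary and serialized to JSON. The helper name `build_completion_mapping` is made up for illustration; only the 'completion' type and the field name 'suggest' come from the text.

```python
import json


def build_completion_mapping(field_name="suggest"):
    """Return an index-creation body with a single completion field."""
    return {
        "mappings": {
            "properties": {
                # The 'completion' type is what enables fast prefix lookups.
                field_name: {"type": "completion"}
            }
        }
    }


mapping_body = build_completion_mapping()
print(json.dumps(mapping_body, indent=2))
```

The printed JSON is the body you would send when creating the index; how you send it (Console, curl, or a client library) is left open here.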
3. Intermediate: Indexing Data for Suggestions
Concept: How to add documents with completion fields to enable suggestions.
When adding documents, you provide values for the completion field. For example: { "suggest": "apple" } Elasticsearch indexes this value so it can quickly find it when a user types a prefix like 'app'. You can also add multiple suggestions or weights to influence ranking.
Result
Documents are indexed with suggestion data, ready to be queried for autocomplete.
Understanding how data is indexed for suggestions helps you prepare your data for effective autocomplete.
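To index several suggestions at once, one option is the bulk API, which takes newline-delimited JSON: an action line followed by the document itself. The sketch below only builds that payload as a string. The index name `fruits`, the helper `to_bulk_ndjson`, and the sample weights are assumptions for illustration.

```python
import json


def to_bulk_ndjson(index_name, suggestions):
    """Build a bulk payload from (inputs, weight) pairs.

    Each document gets an action line and a source line, and the
    payload ends with a newline as the bulk API expects.
    """
    lines = []
    for doc_id, (inputs, weight) in enumerate(suggestions, start=1):
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc_id)}}))
        lines.append(json.dumps({"suggest": {"input": inputs, "weight": weight}}))
    return "\n".join(lines) + "\n"


payload = to_bulk_ndjson("fruits", [(["apple"], 10), (["application"], 5)])
print(payload)
```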
4. Intermediate: Querying with the Completion Suggester
Before reading on: Do you think the completion suggester query returns full documents or just suggestion strings? Commit to your answer.
Concept: How to write a query to get autocomplete suggestions from Elasticsearch.
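You query the completion suggester by adding a 'suggest' section to a search request: { "suggest": { "song-suggest": { "prefix": "app", "completion": { "field": "suggest" } } } } This asks Elasticsearch to return suggestions starting with 'app'. Here 'song-suggest' is just a name you choose for this suggestion request. The response lists the matches as options under that name rather than as regular search hits. Each option carries the suggestion text, a score derived from the indexed weight, and by default the '_source' of the document it came from.
Result
You get a list of options matching the prefix typed, each with its suggestion text and its source document.
Knowing that results arrive as suggestion options, not as regular search hits, clarifies how to use them in your app.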
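A minimal sketch of both halves of this step: building the request body and pulling the suggestion texts out of a response. The `sample_response` is hand-written in the general shape of a suggest response, not captured from a real cluster, and the helper names are hypothetical.

```python
def build_suggest_query(prefix, field="suggest", name="song-suggest", size=5):
    """Build the body of a search request that asks for prefix completions."""
    return {
        "suggest": {
            name: {
                "prefix": prefix,
                "completion": {"field": field, "size": size},
            }
        }
    }


def extract_texts(response, name="song-suggest"):
    """Collect the suggestion text of every option returned under `name`."""
    return [
        option["text"]
        for entry in response["suggest"][name]
        for option in entry["options"]
    ]


# Hand-written sample in the shape of a suggest response (illustrative only).
sample_response = {
    "suggest": {
        "song-suggest": [
            {
                "text": "app",
                "offset": 0,
                "length": 3,
                "options": [
                    {"text": "apple", "_id": "1", "_score": 10.0},
                    {"text": "application", "_id": "2", "_score": 5.0},
                ],
            }
        ]
    }
}

query_body = build_suggest_query("app")
texts = extract_texts(sample_response)
print(texts)
```

The 'size' option caps how many options come back; it is set explicitly here only to make that knob visible.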
5. Intermediate: Using Weights to Rank Suggestions
Before reading on: Do you think all suggestions are ranked alphabetically or can you control their order? Commit to your answer.
Concept: How to assign weights to suggestions to influence their order in results.
You can add a 'weight' to each suggestion when indexing: { "suggest": { "input": ["apple"], "weight": 10 } } Higher weight means the suggestion appears higher in the list. This helps prioritize popular or important suggestions.
Result
Suggestions are ranked by weight, improving relevance for users.
Understanding weights lets you control suggestion order, making autocomplete smarter.
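The effect of weights can be imitated in a few lines of plain Python. This is a toy simulation of the ranking behaviour described in this step, not how Elasticsearch computes it; the entries and weights are invented.

```python
def rank_suggestions(entries, prefix, size=5):
    """Return up to `size` texts that start with `prefix`, highest weight first."""
    matches = [(text, weight) for text, weight in entries if text.startswith(prefix)]
    # Sort by descending weight; ties keep their original order (stable sort).
    matches.sort(key=lambda pair: -pair[1])
    return [text for text, _ in matches[:size]]


entries = [("appetite", 1), ("apple", 10), ("application", 5), ("banana", 20)]
top = rank_suggestions(entries, "app")
print(top)  # ['apple', 'application', 'appetite']
```

Note that 'banana' has the highest weight overall but never appears, because weights only order the entries that match the typed prefix.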
6. Advanced: Handling Multiple Inputs and Contexts
Before reading on: Can a single document provide multiple different suggestions or contexts? Commit to your answer.
Concept: How to index multiple inputs and use contexts for filtering suggestions.
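You can provide multiple inputs for one suggestion: { "suggest": { "input": ["apple", "applesauce"], "weight": 5 } } Also, you can add contexts (like categories) to filter suggestions based on user needs, e.g., only show fruit-related suggestions. Contexts must be declared in the completion field's mapping, and each document then lists the context values it belongs to.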
Result
More flexible and targeted suggestions based on multiple inputs and filters.
Knowing how to use multiple inputs and contexts allows building personalized and precise autocomplete.
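A sketch of how a category context threads through the mapping, the document, and the query. The context name `category`, the value `fruit`, the suggestion name `fruit-suggest`, and all helper names are assumptions chosen for illustration.

```python
def build_context_mapping(field="suggest", context_name="category"):
    """Mapping for a completion field that declares one category context."""
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "completion",
                    "contexts": [{"name": context_name, "type": "category"}],
                }
            }
        }
    }


def build_context_doc(inputs, categories, weight=1, field="suggest", context_name="category"):
    """Document body whose suggestion is tagged with context values."""
    return {
        field: {
            "input": inputs,
            "weight": weight,
            "contexts": {context_name: categories},
        }
    }


def build_context_query(prefix, categories, field="suggest",
                        context_name="category", name="fruit-suggest"):
    """Suggest request restricted to the given context values."""
    return {
        "suggest": {
            name: {
                "prefix": prefix,
                "completion": {
                    "field": field,
                    "contexts": {context_name: categories},
                },
            }
        }
    }


doc = build_context_doc(["apple", "applesauce"], ["fruit"], weight=5)
query = build_context_query("app", ["fruit"])
print(doc)
print(query)
```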
7. Expert: Internals and Performance Optimizations
Before reading on: Do you think the completion suggester uses the same inverted index as normal search? Commit to your answer.
Concept: Understanding how the completion suggester stores data internally and how to optimize it.
The completion suggester uses a Finite State Transducer (FST) data structure, different from the normal inverted index. The FST is compact and designed to be held in memory, which is what makes prefix lookups so fast. The cost is that memory use grows with the number and length of indexed inputs, so large suggestion sets need careful tuning of index size and refresh intervals to balance speed and resource use.
Result
You understand the tradeoffs and internal workings that affect performance and resource use.
Knowing the internal FST structure explains why completion suggester is fast and how to optimize it for production.
Under the Hood
The Completion suggester builds a Finite State Transducer (FST) from the indexed completion fields. This FST is a compact, memory-efficient data structure that maps prefixes to possible completions. When a user types a prefix, Elasticsearch quickly traverses the FST to find matching suggestions without scanning the entire index. This differs from normal inverted indexes which map terms to documents, making completion much faster for prefix queries.
Why designed this way?
The FST-based design was chosen to optimize for speed and low latency in autocomplete scenarios. Traditional inverted indexes are slower for prefix matching because they require scanning many terms. The FST compresses common prefixes and allows instant lookup, which is critical for real-time user experiences. Alternatives like n-gram indexes were less efficient or produced less relevant suggestions.
┌───────────────┐
│ Indexed Terms │
│ "apple"       │
│ "application" │
│ "appetite"    │
└──────┬────────┘
       │
       ▼
┌─────────────────────────────┐
│ Build a Finite State        │
│ Transducer (FST) that       │
│ compresses shared prefixes  │
└─────────────┬───────────────┘
              │
              ▼
┌─────────────────────────────┐
│ User types prefix "app"     │
│ Traverse FST for matches    │
└─────────────┬───────────────┘
              │
              ▼
┌─────────────────────────────┐
│ Return suggestions:         │
│ "apple", "application",     │
│ "appetite"                  │
└─────────────────────────────┘
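The lookup flow in the diagram can be approximated with a plain trie. This is a deliberate simplification: a real FST also shares suffixes and encodes weights along its arcs, and nothing below reflects Lucene's actual implementation. The class names are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.weight = None  # set only on nodes that end a complete input


class PrefixIndex:
    """Toy prefix index: walk to the prefix node, then collect completions."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, text, weight=1):
        node = self.root
        for ch in text:
            node = node.children.setdefault(ch, TrieNode())
        node.weight = weight

    def suggest(self, prefix, size=5):
        # Step 1: follow the prefix; stop early if no input starts with it.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Step 2: collect every complete input below the prefix node.
        found = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.weight is not None:
                found.append((text, current.weight))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        # Step 3: highest weight first, alphabetical order on ties.
        found.sort(key=lambda pair: (-pair[1], pair[0]))
        return [text for text, _ in found[:size]]


index = PrefixIndex()
for text, weight in [("apple", 10), ("application", 5), ("appetite", 1), ("banana", 7)]:
    index.add(text, weight)

print(index.suggest("app"))  # ['apple', 'application', 'appetite']
```

Only the subtree under "app" is visited, which is the property that lets prefix lookups avoid scanning every indexed term.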
Myth Busters - 4 Common Misconceptions
Quick: Does the completion suggester return full documents or just suggestion strings? Commit to your answer.
Common Belief: The completion suggester returns full documents matching the prefix, just like a normal search.
Reality: It returns suggestion options, not regular search hits. Each option contains the suggestion text and, by default, the '_source' of the document it came from. You can trim the returned '_source' with source filtering.
Why it matters: Expecting regular search hits can lead to confusion and incorrect application logic when integrating suggestions.
Quick: Can you use fuzzy matching with the completion suggester? Commit to yes or no.
Common Belief: The completion suggester corrects typos automatically.
Reality: By default it only matches prefixes exactly. Typo tolerance is opt-in: you must add a 'fuzzy' object to the completion query, and it applies to the typed prefix only. For correcting whole misspelled words or phrases, the term and phrase suggesters are a better fit.
Why it matters: Assuming typos are handled by default can cause poor user experience when misspelled prefixes return nothing.
Quick: Does indexing more suggestions always improve autocomplete? Commit to yes or no.
Common Belief: More suggestions indexed always improve autocomplete quality without downsides.
Reality: Adding many suggestions increases memory use and index size, which can slow down performance if not managed.
Why it matters: Ignoring resource limits can cause slow searches or system crashes in production.
Quick: Is the completion suggester suitable for very large datasets without tuning? Commit to yes or no.
Common Belief: It works well out-of-the-box for any dataset size without special tuning.
Reality: Large datasets require tuning of refresh intervals, memory settings, and possibly sharding to maintain performance.
Why it matters: Not tuning can lead to slow or unstable autocomplete in real-world large-scale systems.
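On typo tolerance specifically: the completion query accepts an opt-in 'fuzzy' object. The sketch below builds such a request body. The helper name and the misspelled prefix are illustrative, and the defaults chosen here are assumptions rather than recommendations.

```python
def build_fuzzy_suggest_query(prefix, field="suggest", name="song-suggest", fuzziness=1):
    """Build a suggest body that tolerates small typos in the prefix."""
    return {
        "suggest": {
            name: {
                "prefix": prefix,
                "completion": {
                    "field": field,
                    # Without this object the prefix must match exactly.
                    "fuzzy": {"fuzziness": fuzziness},
                },
            }
        }
    }


fuzzy_body = build_fuzzy_suggest_query("aple")
print(fuzzy_body)
```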
Expert Zone
1. The FSTs live in immutable index segments, so suggestions reflect the last refresh state, not real-time inserts.
2. Weights are relative and only influence suggestion order within the same prefix; they do not guarantee global ranking.
3. Context filtering can be combined with the completion suggester to build multi-dimensional autocomplete, but it adds complexity and memory overhead.
When NOT to use
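The completion suggester only matches from the start of each indexed input, so avoid it when users need to match words in the middle of a phrase; the 'search_as_you_type' field type or edge n-grams offer more flexible matching, and are also worth considering for very large datasets with frequent updates. Its typo tolerance is limited to the opt-in 'fuzzy' option on prefixes; for correcting misspelled words or whole phrases in a finished query, use the term or phrase suggesters instead.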
Production Patterns
In production, completion suggester is often combined with context filters to personalize suggestions by user location or category. It is also common to precompute weights based on popularity or recency to improve relevance. Index refresh intervals are tuned to balance freshness and performance.
Connections
Trie Data Structure
The completion suggester's FST is a compressed form of a trie used for prefix matching.
Understanding tries from computer science helps grasp how prefix searches are optimized in Elasticsearch.
Autocomplete in Mobile Keyboards
Both use prefix prediction to speed up typing and improve user experience.
Knowing how mobile keyboards predict words helps understand the user value and design goals of completion suggesters.
Finite State Machines in Linguistics
The FST used internally is a type of finite state machine that models sequences efficiently.
Recognizing the connection to finite state machines reveals why the data structure is compact and fast for language tasks.
Common Pitfalls
#1 Expecting typo tolerance without enabling fuzzy matching.
Wrong approach: { "suggest": { "song-suggest": { "prefix": "aple", "completion": { "field": "suggest" } } } }
Correct approach: { "suggest": { "song-suggest": { "prefix": "aple", "completion": { "field": "suggest", "fuzzy": { "fuzziness": 1 } } } } }
Root cause: Assuming the completion suggester handles misspelled prefixes by default; fuzzy matching is supported, but only when the 'fuzzy' option is set.
#2 Defining the completion field as a normal text field.
Wrong approach: { "mappings": { "properties": { "suggest": { "type": "text" } } } }
Correct approach: { "mappings": { "properties": { "suggest": { "type": "completion" } } } }
Root cause: Not using the special 'completion' type prevents fast prefix lookups.
#3 Expecting suggestions to update instantly after indexing a document.
Wrong approach: Index document with suggestion, then immediately query and expect it to appear.
Correct approach: Index document, then wait for index refresh or force refresh before querying suggestions.
Root cause: Not understanding that suggestions reflect the last index refresh state, not real-time inserts.
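For the refresh pitfall, one way to wait for visibility is the 'refresh' parameter on the index request. The sketch below only builds the request path; the index name `fruits` and the helper `index_doc_path` are assumptions for illustration.

```python
from urllib.parse import urlencode


def index_doc_path(index_name, doc_id, refresh=None):
    """Build the path of an index-document request, optionally with refresh."""
    path = f"/{index_name}/_doc/{doc_id}"
    if refresh is not None:
        # 'wait_for' holds the response until a refresh makes the doc searchable.
        path += "?" + urlencode({"refresh": refresh})
    return path


print(index_doc_path("fruits", "1"))
print(index_doc_path("fruits", "1", refresh="wait_for"))
```

Forcing or waiting for a refresh on every write trades indexing throughput for freshness, so it tends to suit tests and low-volume updates better than bulk loads.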
Key Takeaways
The Completion suggester provides fast, prefix-based autocomplete by indexing data in a special 'completion' field.
It uses a Finite State Transducer internally to quickly find suggestions without scanning the entire index.
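Suggestions are returned as options containing the suggestion text and, by default, the source document, and can be ranked using weights.
Fuzzy matching is opt-in through the 'fuzzy' option and applies only to prefixes, so other suggesters are needed for broader typo correction.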
Proper setup, tuning, and understanding of its limits are essential for building effective autocomplete features.