Search after for efficient pagination in Elasticsearch - Time & Space Complexity
When using Elasticsearch to get many pages of results, how fast the search runs matters a lot.
We want to know how the time to get results changes as we ask for more pages using the search after method.
Analyze the time complexity of the following code snippet.
GET /my-index/_search
{
  "size": 10,
  "query": { "match_all": {} },
  "sort": [ { "timestamp": "asc" }, { "_id": "asc" } ],
  "search_after": ["2023-01-01T00:00:00", "abc123"]
}
This query fetches 10 results after a given sort position, using search after for pagination.
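To make the mechanics concrete, here is a minimal in-memory sketch in Python of how a client pages with search after. It does not call Elasticsearch: `fake_search` is a hypothetical stand-in for the `_search` endpoint, and the documents, `id` field, and page size are made up for illustration.

```python
from bisect import bisect_right

# Hypothetical in-memory "index": documents already sorted by (timestamp, id),
# mirroring a sort on timestamp with a tiebreaker field.
DOCS = [{"timestamp": t, "id": f"doc{t:03d}"} for t in range(25)]
SORT_KEYS = [(d["timestamp"], d["id"]) for d in DOCS]


def fake_search(size, search_after=None):
    """Stand-in for a _search call: return up to `size` docs after the cursor."""
    start = 0 if search_after is None else bisect_right(SORT_KEYS, tuple(search_after))
    return DOCS[start:start + size]


def fetch_all(size):
    """Page through every document, feeding the last sort values back in."""
    pages, cursor = [], None
    while True:
        hits = fake_search(size, cursor)
        if not hits:
            break
        pages.append(hits)
        last = hits[-1]
        cursor = [last["timestamp"], last["id"]]  # sort values of the last hit
    return pages


pages = fetch_all(size=10)
print([len(p) for p in pages])  # [10, 10, 5]
```

Each request carries only the sort values of the last hit, so the client never asks the server to skip over earlier pages by offset.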
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Elasticsearch scans documents sorted by the given fields to find the next page.
- How many times: For each page requested, it performs a search starting after the last result of the previous page.
As you request more pages, the search after method does not re-scan all previous results but continues from the last position.
| Input Size (pages) | Approx. Operations |
|---|---|
| 10 | About 10 times the cost of one page scan |
| 100 | About 100 times the cost of one page scan |
| 1000 | About 1000 times the cost of one page scan |
Pattern observation: The cost grows linearly with the number of pages requested.
Time Complexity: O(n), where n is the number of pages fetched
This means the total time to get results grows in direct proportion to how many pages you fetch.
[X] Wrong: "Using search after means the query time stays the same no matter how many pages I fetch."
[OK] Correct: Each new page requires a new search starting after the last result. The cost of a single page does not grow with page depth, but the total time adds up as you fetch more pages.
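The difference between the cost of one page and the total cost can be sketched with a toy model. It assumes every search after request costs the same fixed amount of work (`COST_PER_PAGE`, a made-up unit); real per-request cost depends on the index, the shards, and the query.

```python
COST_PER_PAGE = 1  # assumed, illustrative unit of work for one search_after request


def cumulative_costs(pages):
    """Return the running total of work after each page under the fixed-cost assumption."""
    totals, running = [], 0
    for _ in range(pages):
        running += COST_PER_PAGE  # each page adds the same amount of work
        totals.append(running)
    return totals


totals = cumulative_costs(1000)
print(totals[9], totals[99], totals[999])  # totals after 10, 100, and 1000 pages: 10 100 1000
```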
Understanding how search after scales helps you explain efficient pagination in Elasticsearch, a useful skill for real projects and interviews.
"What if we replaced search after with from and size for pagination? How would the time complexity change?"
Practice
What is the purpose of search_after in Elasticsearch pagination?
Solution
Step 1: Understand pagination challenges
Deep pagination with large result sets can be slow and inefficient using traditional methods like from and size.
Step 2: Role of search_after
search_after uses the last sort values from the previous page to fetch the next page efficiently, avoiding performance issues.
Final Answer:
To efficiently paginate through large result sets without performance loss -> Option C
Quick Check:
Purpose of search_after = Efficient pagination [OK]
- Confusing search_after with filtering
- Thinking search_after sorts results automatically
- Using search_after without sorting
What is the correct syntax for search_after in an Elasticsearch query?
Solution
Step 1: Check the expected data type for search_after
The search_after parameter expects an array of sort values, not a single string or object.
Step 2: Match syntax with correct format
"search_after": ["last_sort_value"] correctly shows search_after as an array with the last sort value inside.
Final Answer:
"search_after": ["last_sort_value"] -> Option A
Quick Check:
search_after syntax = array of values [OK]
- Passing a single string instead of an array
- Using an object instead of an array
- Setting search_after to a boolean
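These mistakes can be caught on the client before a request is sent. The following is a minimal sketch in Python; `build_search_body` is a hypothetical helper, not part of any Elasticsearch client, and it only checks that search_after is a list with one value per sort field.

```python
def build_search_body(size, sort, search_after=None):
    """Build a search request body, checking the shape of search_after first."""
    body = {"size": size, "sort": sort}
    if search_after is not None:
        if not isinstance(search_after, list):
            raise TypeError("search_after must be a list of sort values")
        if len(search_after) != len(sort):
            raise ValueError("search_after needs one value per sort field")
        body["search_after"] = search_after
    return body


body = build_search_body(10, [{"timestamp": "asc"}], ["last_sort_value"])
print(body)
```

Passing a bare string such as "last_sort_value" instead of a list raises TypeError in this sketch, which mirrors the first mistake in the list above.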
What does the following query return, given "search_after": [1627891234567]?
{
  "size": 5,
  "sort": [{"timestamp": "asc"}],
  "search_after": [1627891234567]
}
Solution
Step 1: Understand sorting and search_after usage
The query sorts documents by timestamp ascending and uses search_after with a timestamp value.
Step 2: Effect of search_after value
search_after tells Elasticsearch to return documents after the given sort value, so only documents with timestamp greater than 1627891234567 are returned.
Final Answer:
It returns 5 documents with timestamp strictly greater than 1627891234567 -> Option D
Quick Check:
search_after filters results after given sort value [OK]
- Thinking it returns documents before the value
- Assuming it returns the first page always
- Confusing search_after with from/size pagination
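The "strictly greater than" behavior with a single ascending sort field can be illustrated with a small in-memory sketch in Python. The timestamps are made up, and `page_after` is a hypothetical stand-in for what the search returns, not an Elasticsearch API.

```python
from bisect import bisect_right

# Hypothetical timestamps already sorted ascending, as the sort clause would order them.
timestamps = [1627891234000, 1627891234567, 1627891234567, 1627891235000,
              1627891236000, 1627891237000, 1627891238000, 1627891239000]


def page_after(sorted_values, cursor, size):
    """Return up to `size` values strictly greater than `cursor`."""
    start = bisect_right(sorted_values, cursor)  # skip everything <= cursor
    return sorted_values[start:start + size]


page = page_after(timestamps, 1627891234567, size=5)
print(page)
```

Both entries equal to the cursor value are skipped. With only one sort field, documents that share the cursor's timestamp can be missed, which is why a unique tiebreaker field is usually added to the sort.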
{
  "size": 10,
  "sort": [{"date": "desc"}],
  "search_after": "2023-01-01T00:00:00"
}
But it returns an error. What is the likely cause?
Solution
Step 1: Check the type of search_after value
The search_after parameter requires an array of values, but here it is a string.
Step 2: Identify the error cause
Passing a string instead of an array causes a syntax error in the query.
Final Answer:
search_after value must be an array, not a string -> Option B
Quick Check:
search_after requires array input [OK]
- Passing single value without array brackets
- Using unsupported sort order
- Misunderstanding size limits with search_after
A query sorts by user_id (ascending) and then timestamp (descending). Which search_after value correctly fetches the next page after user_id=42 and timestamp=1680000000?
Solution
Step 1: Understand sort order and search_after values
The sort is by user_id ascending, then timestamp descending. The search_after array must match this order.
Step 2: Match values to sort order
The correct search_after is an array with user_id first, then timestamp. Since timestamp is descending, the value is used as is (no negation).
Final Answer:
[42, 1680000000] -> Option A
Quick Check:
search_after array matches sort fields order [OK]
- Reversing order of values in search_after
- Negating timestamp for descending sort
- Using strings instead of numbers without need
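How a mixed-direction sort decides what comes "after" the cursor can be sketched in Python. The documents below are hypothetical, and `is_after` only models the comparison for this specific sort (user_id ascending, timestamp descending); it is not an Elasticsearch function.

```python
def is_after(doc, cursor):
    """True if doc sorts after cursor under (user_id asc, timestamp desc)."""
    user_id, timestamp = cursor
    if doc["user_id"] != user_id:
        return doc["user_id"] > user_id   # ascending on user_id
    return doc["timestamp"] < timestamp   # descending on timestamp


# Hypothetical documents, listed in the order the sort would produce.
docs = [
    {"user_id": 41, "timestamp": 1690000000},
    {"user_id": 42, "timestamp": 1690000000},
    {"user_id": 42, "timestamp": 1680000000},
    {"user_id": 42, "timestamp": 1670000000},
    {"user_id": 43, "timestamp": 1650000000},
]

next_page = [d for d in docs if is_after(d, (42, 1680000000))]
print(next_page)
```

The cursor values are passed unchanged; the descending direction only changes which side of the timestamp counts as "after", so no negation is needed.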
