
Why Use search_after for Efficient Pagination in Elasticsearch? - Purpose & Use Cases

The Big Idea

What if you could flip through millions of items instantly without waiting?

The Scenario

Imagine you have a huge list of products in an online store. You want to show them to customers page by page, but each time you ask the system for page 10 or 20, it has to count and skip all the previous items first.

The Problem

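Offset pagination is slow because every request re-collects all the earlier pages: each shard must gather from + size hits, and the coordinating node sorts them all just to discard everything before the offset. The deeper the page, the more memory and time each request costs, so customers wait longer and get frustrated.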
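
To make that cost concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that each shard returns its top from + size hits to the coordinating node; the function name hits_to_merge and the shard count of 5 are illustrative assumptions, not part of the original example.

```python
def hits_to_merge(from_: int, size: int, shards: int) -> int:
    """Approximate number of candidate hits the coordinating node must merge.

    Assumption: every shard returns its top (from_ + size) hits, so the
    coordinating node sorts shards * (from_ + size) entries and then keeps
    only the last `size` of them.
    """
    return shards * (from_ + size)


# Hypothetical index with 5 primary shards.
first_page = hits_to_merge(from_=0, size=10, shards=5)
deep_page = hits_to_merge(from_=1000, size=10, shards=5)
print(first_page, deep_page)
```

The same 10 visible results cost roughly a hundred times more merge work at offset 1000 than on the first page, and the gap keeps growing with the offset.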

The Solution

With search_after, the client passes the sort values of the last hit on the previous page, and Elasticsearch returns the results that come immediately after them. Instead of counting from the start every time, each request picks up where the previous one ended, so pagination stays fast even millions of items deep.

Before vs After
Before
GET /products/_search
{
  "from": 1000, "size": 10,
  "query": { "match_all": {} }
}
After
GET /products/_search
{
  "size": 10,
  "search_after": ["last_sort_value"],
  "sort": ["price"]
}
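
The request above only covers a single page. A minimal Python sketch of the full loop might look like the following. It relies on each hit carrying its sort values in a "sort" array, which becomes the next search_after. The names iterate_hits, fake_search, and PRODUCTS are hypothetical, and fake_search is an in-memory stand-in for a real client call so the example runs without a cluster. The extra id sort field is an assumed tiebreaker, added because several products share the same price.

```python
from typing import Any, Callable, Iterator

Body = dict[str, Any]

# Hypothetical catalogue; several products share a price on purpose.
PRODUCTS = [{"id": i, "price": (i * 7) % 5} for i in range(23)]
SORT = [{"price": "asc"}, {"id": "asc"}]  # id acts as a tiebreaker (assumption)


def fake_search(body: Body) -> list[dict]:
    """Stand-in for a real search call; mimics ascending sort on (price, id)."""
    rows = sorted(PRODUCTS, key=lambda d: (d["price"], d["id"]))
    after = body.get("search_after")
    if after is not None:
        # Keep only documents that sort strictly after the cursor.
        rows = [d for d in rows if [d["price"], d["id"]] > after]
    return [{"_source": d, "sort": [d["price"], d["id"]]} for d in rows[: body["size"]]]


def iterate_hits(search: Callable[[Body], list[dict]], sort: list, size: int = 10) -> Iterator[dict]:
    """Yield every hit, page by page, using search_after as the cursor."""
    body: Body = {"size": size, "sort": sort}
    while True:
        hits = search(body)
        if not hits:
            return
        yield from hits
        # The last hit's sort values are the starting point for the next page.
        body = {**body, "search_after": hits[-1]["sort"]}


ids = [hit["_source"]["id"] for hit in iterate_hits(fake_search, SORT, size=10)]
print(len(ids), len(set(ids)))
```

With a real cluster, fake_search would be replaced by whatever client call returns the list of hits; the loop itself stays the same.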
What It Enables

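This technique enables fast sequential paging through huge result sets, including past the default 10,000-hit limit on from + size (index.max_result_window). The trade-off is that you can only move on to the next page, not jump to an arbitrary page number.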

Real Life Example

Think of scrolling through thousands of social media posts or product reviews where you want instant loading of the next page without waiting.

Key Takeaways

Offset pagination with from and size gets slower and more resource-heavy the deeper you page.

Search after jumps directly to the next page using the last item's sort value.

This makes browsing large data sets fast and user-friendly.

Practice

(1/5)
1. What is the main purpose of using search_after in Elasticsearch pagination?
easy
A. To filter documents based on a query
B. To sort documents alphabetically by default
C. To efficiently paginate through large result sets without performance loss
D. To update documents in bulk

Solution

  1. Step 1: Understand pagination challenges

    Deep pagination with large result sets can be slow and inefficient using traditional methods like from and size.
  2. Step 2: Role of search_after

    search_after uses the last sort values from the previous page to fetch the next page directly, without re-collecting the earlier pages.
  3. Final Answer:

    To efficiently paginate through large result sets without performance loss -> Option C
  4. Quick Check:

    Purpose of search_after = Efficient pagination [OK]
Hint: Remember: search_after uses last sort values for fast paging [OK]
Common Mistakes:
  • Confusing search_after with filtering
  • Thinking search_after sorts results automatically
  • Using search_after without sorting
2. Which of the following is the correct syntax snippet to use search_after in an Elasticsearch query?
easy
A. "search_after": ["last_sort_value"]
B. "search_after": "last_sort_value"
C. "search_after": {"value": "last_sort_value"}
D. "search_after": true

Solution

  1. Step 1: Check the expected data type for search_after

    The search_after parameter expects an array of sort values, not a single string or object.
  2. Step 2: Match syntax with correct format

    "search_after": ["last_sort_value"] correctly shows search_after as an array with the last sort value inside.
  3. Final Answer:

    "search_after": ["last_sort_value"] -> Option A
  4. Quick Check:

    search_after syntax = array of values [OK]
Hint: search_after always takes an array of sort values [OK]
Common Mistakes:
  • Passing a single string instead of an array
  • Using an object instead of an array
  • Setting search_after to a boolean
3. Given this Elasticsearch query snippet, what will be the effect of adding "search_after": [1627891234567]?
{
  "size": 5,
  "sort": [{"timestamp": "asc"}],
  "search_after": [1627891234567]
}
medium
A. It causes a syntax error because search_after is not allowed here
B. It returns the first 5 documents sorted by timestamp ascending
C. It returns 5 documents with timestamp less than or equal to 1627891234567
D. It returns 5 documents with timestamp strictly greater than 1627891234567

Solution

  1. Step 1: Understand sorting and search_after usage

    The query sorts documents by timestamp ascending and uses search_after with a timestamp value.
  2. Step 2: Effect of search_after value

    search_after tells Elasticsearch to return documents whose sort values come after the given value in sort order. Because the sort is ascending on timestamp, only documents with timestamp greater than 1627891234567 are returned.
  3. Final Answer:

    It returns 5 documents with timestamp strictly greater than 1627891234567 -> Option D
  4. Quick Check:

    search_after filters results after given sort value [OK]
Hint: search_after returns results after the given sort values [OK]
Common Mistakes:
  • Thinking it returns documents before the value
  • Assuming it returns the first page always
  • Confusing search_after with from/size pagination
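
The behaviour in question 3 can be mimicked with a small in-memory Python sketch. It is not a real Elasticsearch call; page_after and the sample timestamps are hypothetical and only model an ascending single-field sort.

```python
def page_after(timestamps: list[int], after: int, size: int) -> list[int]:
    """Mimic sort=[{"timestamp": "asc"}] combined with search_after=[after]."""
    ordered = sorted(timestamps)
    # Only values that sort strictly after the cursor are eligible.
    return [t for t in ordered if t > after][:size]


stamps = [1627891234000, 1627891234567, 1627891234567, 1627891235000, 1627891236000]
print(page_after(stamps, 1627891234567, 5))
# -> [1627891235000, 1627891236000]
```

Two details show up here: documents whose timestamp equals the cursor are skipped, and the page can hold fewer than size documents when few remain. The first is why a unique tiebreaker field is usually added to the sort.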
4. You wrote this Elasticsearch query to paginate results:
{
  "size": 10,
  "sort": [{"date": "desc"}],
  "search_after": "2023-01-01T00:00:00"
}
But it returns an error. What is the likely cause?
medium
A. size cannot be 10 with search_after
B. search_after value must be an array, not a string
C. sort order must be ascending for search_after
D. date field cannot be used in sort

Solution

  1. Step 1: Check the type of search_after value

    The search_after parameter requires an array of values, but here it is a string.
  2. Step 2: Identify the error cause

    The JSON itself is valid, but Elasticsearch rejects the request with a parsing error because search_after must be an array, even when there is only one sort value.
  3. Final Answer:

    search_after value must be an array, not a string -> Option B
  4. Quick Check:

    search_after requires array input [OK]
Hint: Always wrap search_after values in an array [OK]
Common Mistakes:
  • Passing single value without array brackets
  • Using unsupported sort order
  • Misunderstanding size limits with search_after
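
Mistakes like the one in question 4 can be caught before the request is sent. The following is a hypothetical client-side check in Python, not part of any Elasticsearch client; check_search_after and its messages are assumptions that simply encode the rules stated in the quiz (array value, sort present, one value per sort field).

```python
def check_search_after(body: dict) -> list[str]:
    """Return a list of problems with the search_after part of a request body."""
    problems: list[str] = []
    after = body.get("search_after")
    if after is None:
        return problems  # nothing to check on the first page

    if not isinstance(after, list):
        problems.append("search_after must be an array")

    sort = body.get("sort")
    if not sort:
        problems.append("search_after requires a sort")
    else:
        sort_fields = sort if isinstance(sort, list) else [sort]
        if isinstance(after, list) and len(after) != len(sort_fields):
            problems.append("search_after needs one value per sort field")
    return problems


broken = {"size": 10, "sort": [{"date": "desc"}], "search_after": "2023-01-01T00:00:00"}
fixed = {**broken, "search_after": ["2023-01-01T00:00:00"]}
print(check_search_after(broken))
print(check_search_after(fixed))
```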
5. You want to paginate through a large dataset sorted by user_id (ascending) and then timestamp (descending). Which search_after value correctly fetches the next page after user_id=42 and timestamp=1680000000?
hard
A. [42, 1680000000]
B. [42, -1680000000]
C. [1680000000, 42]
D. ["42", "1680000000"]

Solution

  1. Step 1: Understand sort order and search_after values

    The sort is by user_id ascending, then timestamp descending. The search_after array must match this order.
  2. Step 2: Match values to sort order

    The correct search_after is an array with user_id first, then timestamp. Since timestamp is descending, the value is used as is (no negation).
  3. Final Answer:

    [42, 1680000000] -> Option A
  4. Quick Check:

    search_after array matches sort fields order [OK]
Hint: search_after array order matches sort fields order exactly [OK]
Common Mistakes:
  • Reversing order of values in search_after
  • Negating timestamp for descending sort
  • Using strings instead of numbers without need