Elasticsearch · Debug/Fix · Intermediate · 3 min read

How to Avoid Deep Pagination in Elasticsearch Efficiently

Deep pagination in Elasticsearch happens when you use large from and size values in queries, forcing the cluster to collect and discard thousands of hits just to return one page. To avoid this, use search_after for efficient cursor-based pagination, or the scroll API for large one-off exports, instead of deep from offsets.
🔍

Why This Happens

Elasticsearch paginates results with the from and size parameters. When from is large, every shard must collect and sort from + size hits, and the coordinating node then merges all of them just to return one small page. This wastes CPU and memory on work that is thrown away, and by default Elasticsearch rejects any request where from + size exceeds 10,000 (the index.max_result_window setting).

json
{
  "query": {
    "match_all": {}
  },
  "from": 10000,
  "size": 10
}
Output
Error: Result window is too large, from + size must be less than or equal to: [10000] but was [10010]. See the scroll api for a more efficient way to request large data sets.
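To see why this is costly, remember that each shard builds a priority queue of from + size hits, and the coordinating node merges all of them. A back-of-the-envelope sketch in Python (the shard count and the hits_buffered helper are illustrative, not an Elasticsearch API):

```python
def hits_buffered(shards: int, from_: int, size: int) -> int:
    """Each shard collects and sorts from + size hits, so the
    coordinating node merges shards * (from + size) entries in total."""
    return shards * (from_ + size)

# Page 1 vs. page 1001 of a 10-results-per-page UI on a 5-shard index:
print(hits_buffered(5, 0, 10))      # 50 hits buffered cluster-wide
print(hits_buffered(5, 10000, 10))  # 50050 hits buffered for one page of 10
```

The work grows linearly with from, which is exactly why deep offsets get slower the further you page.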
🔧

The Fix

Replace deep from pagination with search_after, which uses the sort values of the last hit on the current page as a cursor for the next one, so each request does roughly the same amount of work regardless of depth. search_after requires a deterministic sort, so include a unique tiebreaker field. For one-off retrieval of very large result sets, use the scroll API, which keeps a search context open and streams results; on Elasticsearch 7.10 and later, search_after combined with a point-in-time (PIT) is the recommended replacement for scroll.

json
{
  "query": {
    "match_all": {}
  },
  "size": 10,
  "sort": [
    {"_id": "asc"}
  ],
  "search_after": ["last_doc_sort_value"]
}
Output
Returns the next 10 documents after the hit whose sort value is 'last_doc_sort_value', at the same cost no matter how deep the page is.
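The cursor logic is easy to see without a cluster. Below is a hypothetical in-memory simulation of search_after paging over documents pre-sorted by _id; search_after_page is our own helper for illustration, not an Elasticsearch client call:

```python
def search_after_page(docs, size, after=None):
    """Return one page of pre-sorted docs plus the cursor (last sort
    value) for the next page, mimicking Elasticsearch's search_after."""
    if after is not None:
        # Keep only hits strictly after the cursor, like search_after does.
        docs = [d for d in docs if d["_id"] > after]
    page = docs[:size]
    next_after = page[-1]["_id"] if page else None
    return page, next_after

docs = [{"_id": f"doc{i:03d}"} for i in range(25)]  # sorted by _id asc
page1, cursor = search_after_page(docs, 10)          # doc000 .. doc009
page2, cursor = search_after_page(docs, 10, after=cursor)
print(page2[0]["_id"])  # doc010
```

Each request carries the previous page's last sort value instead of an offset, so there is nothing to skip over.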
🛡️

Prevention

To avoid deep pagination issues, use search_after for any paging beyond the first few pages, and scroll (or search_after with a point-in-time) for exporting or batch-processing large datasets. Avoid large from offsets entirely. Design your UI to load data incrementally ("load more", infinite scroll) and cache pages where possible.
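For export-style jobs, the access pattern is a loop that drains fixed-size batches until the result set is exhausted. A minimal Python sketch of that pattern (stream_batches is our own stand-in for a scroll or search_after loop, not a client API):

```python
from typing import Iterator, List

def stream_batches(docs: List[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield fixed-size batches in order, the shape a scroll /
    search_after export loop produces: no batch re-reads earlier hits."""
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]

exported = 0
for batch in stream_batches([{"_id": i} for i in range(1034)], 500):
    exported += len(batch)  # in a real export: index, write to file, etc.
print(exported)  # 1034
```

The key property is that each iteration touches only its own batch, so total work is proportional to the data exported, not to how deep the loop has progressed.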

⚠️

Related Errors

Deep pagination commonly surfaces as query timeouts and memory pressure on Elasticsearch nodes. Raising timeouts only masks the problem; the durable fix is to switch to search_after or scroll and keep from values small.

Key Takeaways

Avoid using large 'from' values for pagination in Elasticsearch queries.
Use 'search_after' for efficient cursor-based pagination beyond the first pages.
Use the 'scroll' API to retrieve large sets of data without performance loss.
Design applications to load data incrementally and cache results to reduce deep pagination.
Deep pagination causes slow queries and high memory use, so always optimize your paging strategy.