Pagination (from/size) in Elasticsearch - Deep Dive

Overview - Pagination (from/size)
What is it?
Pagination in Elasticsearch using from and size is a way to split search results into smaller parts or pages. The 'from' parameter tells Elasticsearch where to start in the list of results, and the 'size' parameter tells how many results to show. This helps users see results in chunks instead of all at once. It is useful when there are many results and you want to browse them page by page.
Why it matters
Without pagination, users would have to load all search results at once, which can be slow and overwhelming. Pagination improves performance and user experience by loading only a small set of results at a time. It also helps save resources on the server and client side. This makes searching large datasets practical and efficient.
Where it fits
Before learning pagination, you should understand basic Elasticsearch search queries and how results are returned. After mastering pagination, you can learn about more advanced techniques like search_after or scroll for deep pagination and real-time data handling.
Mental Model
Core Idea
Pagination with from and size tells Elasticsearch which slice of the total search results to return, like choosing a page from a book.
Think of it like...
Imagine a big book with many pages, where you only want to read a few pages at a time. 'From' is how many pages you skip before you start reading, and 'size' is how many pages you read in one sitting. This way, you don’t have to carry the whole book, just the pages you need.
Search Results (Total 1000)
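┌────────┬──────────────────┬───────────────┐
│ Page 1 │ from=0,  size=10 │ Results 1-10  │
├────────┼──────────────────┼───────────────┤
│ Page 2 │ from=10, size=10 │ Results 11-20 │
├────────┼──────────────────┼───────────────┤
│ Page 3 │ from=20, size=10 │ Results 21-30 │
└────────┴──────────────────┴───────────────┘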
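The page table above can be mimicked with an ordinary list slice. This is only a sketch of the mental model, not how Elasticsearch executes a query; `all_results` and `page_slice` are illustrative names, not part of any Elasticsearch API.

```python
# Pretend these are 1000 ranked hits, best match first.
all_results = [f"result {rank}" for rank in range(1, 1001)]


def page_slice(results, from_, size):
    """Skip `from_` hits, then take the next `size` hits."""
    return results[from_:from_ + size]


# Page 2 with 10 results per page: from=10, size=10.
page_2 = page_slice(all_results, from_=10, size=10)
print(page_2[0], "to", page_2[-1])
```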
Build-Up - 6 Steps
1. Foundation: Basic Search Result Structure
Concept: Understanding how Elasticsearch returns search results in a list.
When you run a search query in Elasticsearch, it returns a list of matching documents. By default, it returns the first 10 results. These results are ordered by relevance or other criteria you set.
Result
You get a list of up to 10 documents matching your query.
Knowing the default behavior helps you realize why you need pagination to see more than the first 10 results.
2. Foundation: Introducing from and size Parameters
Concept: Learn how to control which part of the results Elasticsearch returns.
The 'from' parameter tells Elasticsearch how many results to skip before starting to return results. The 'size' parameter tells how many results to return after skipping. For example, from=10 and size=5 returns results 11 to 15.
Result
You get a specific slice of the search results, not just the first 10.
Understanding these parameters lets you control which page of results you see.
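As a minimal sketch, the request body for the from=10, size=5 example could be built like this. The helper name `search_body` and the field and value in the query are placeholders; the defaults of 0 for from and 10 for size mirror Elasticsearch's documented defaults.

```python
import json


def search_body(query, from_=0, size=10):
    """Build a search request body; from/size default to 0 and 10."""
    if from_ < 0 or size < 0:
        raise ValueError("from and size must be non-negative")
    return {"from": from_, "size": size, "query": query}


# Skip 10 hits, then return the next 5 (results 11 to 15).
body = search_body({"match": {"field": "value"}}, from_=10, size=5)
print(json.dumps(body, indent=2))
```

The printed JSON is what would be sent as the body of a search request against an index.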
3. Intermediate: Using from/size for Pagination
Before reading on: do you think increasing 'from' by 'size' each time will correctly show the next page of results? Commit to your answer.
Concept: How to use from and size together to show pages of results.
To paginate, set size to the number of results per page. For page 1, use from=0; for page 2, from=size; for page 3, from=2*size, and so on. In general, page n starts at from=(n-1)*size. This way, each page shows the next set of results without overlap.
Result
You can navigate through pages of results by changing from and size.
Knowing this pattern is the foundation of simple pagination in Elasticsearch.
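The page arithmetic can be written down as two small helpers. This is a sketch assuming 1-based page numbers; `page_to_from` and `total_pages` are illustrative names.

```python
import math


def page_to_from(page, size):
    """Convert a 1-based page number to the `from` offset."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return (page - 1) * size


def total_pages(total_hits, size):
    """Number of pages needed to show `total_hits` results."""
    return math.ceil(total_hits / size)


for page in (1, 2, 3):
    print("page", page, "-> from =", page_to_from(page, size=10))
```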
4. Intermediate: Performance Impact of Large from Values
Before reading on: do you think using a very large 'from' value is fast or slow? Commit to your answer.
Concept: Understanding how large 'from' values affect query speed.
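To answer a request, every shard has to collect its own top from + size hits, and the coordinating node then merges those lists and discards the first 'from' results. The work and memory needed therefore grow with 'from', which makes deep pagination inefficient. By default Elasticsearch also rejects requests where from + size exceeds 10,000 (the index.max_result_window setting).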
Result
Queries with large 'from' values become slower and more resource-heavy.
Knowing this helps you avoid performance problems when paginating deep into results.
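A back-of-the-envelope estimate shows how the cost grows. The sketch below assumes a five-shard index (the shard count is an assumption for illustration) and counts only how many hits have to be tracked, not actual time or memory.

```python
def hits_collected(from_, size, shards):
    """Rough count of hits tracked for one from/size request.

    Each shard keeps its own top (from + size) hits, and the
    coordinating node merges all of those lists before slicing.
    """
    per_shard = from_ + size
    return {"per_shard": per_shard, "coordinator": per_shard * shards}


print(hits_collected(0, 10, shards=5))     # first page
print(hits_collected(9990, 10, shards=5))  # page 1000
```

Both requests return 10 documents, but the second one makes the coordinating node merge a thousand times as many entries.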
5. Advanced: Alternatives for Deep Pagination
Before reading on: do you think from/size is the best way to paginate very deep results? Commit to your answer.
Concept: Learn about search_after and scroll as better options for deep pagination.
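For deep pagination, use search_after with sort values or the scroll API. search_after resumes from the sort values of the last hit on the previous page, while scroll keeps a snapshot of the results alive between requests. Both avoid the cost of skipping many results, so they are more efficient for large result sets. Note that Elastic's documentation no longer recommends scroll for deep pagination and suggests search_after with a point in time (PIT) instead.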
Result
You get faster and more scalable pagination for deep pages.
Understanding alternatives prevents misuse of from/size and improves system performance.
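A minimal sketch of how a search_after request might be derived from the previous page. The field names `timestamp` and `id`, the helper `next_page_body`, and the sample hit are all made up for illustration; the only thing taken from Elasticsearch is that each hit of a sorted search carries a "sort" array, which is passed back as search_after.

```python
import copy


def next_page_body(first_body, last_hit):
    """Build the request for the page that follows `last_hit`.

    `last_hit` is the final hit of the previous response; its "sort"
    array holds the values that search_after resumes from.
    """
    body = copy.deepcopy(first_body)
    body.pop("from", None)  # search_after is used instead of an offset
    body["search_after"] = last_hit["sort"]
    return body


first_body = {
    "size": 10,
    "query": {"match_all": {}},
    # "timestamp" and "id" are placeholder field names.
    "sort": [{"timestamp": "asc"}, {"id": "asc"}],
}

# A made-up last hit, shaped like an entry in hits.hits.
last_hit = {"_id": "42", "sort": [1700000000000, "42"]}

print(next_page_body(first_body, last_hit))
```

The sort must stay identical from page to page; only search_after changes.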
6. Expert: Internal Mechanics of from/size Pagination
Before reading on: do you think Elasticsearch fetches only requested results internally or processes all matches first? Commit to your answer.
Concept: How Elasticsearch processes queries with from and size internally.
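Each shard finds the documents that match, scores or sorts them, and keeps only its top from + size hits. The coordinating node merges these per-shard lists, skips the first 'from' hits, and fetches the next 'size' documents. Everything before the requested page is therefore ranked and then thrown away, which is why large 'from' values are inefficient.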
Result
Understanding this explains why large 'from' values slow down queries.
Knowing the internal process clarifies the limits of from/size and guides better pagination strategies.
Under the Hood
A search with from and size runs in two phases. In the query phase, each shard finds its matching documents, orders them by the query's sort rules (relevance by default), and returns the identifiers and sort values of its top from + size hits. The coordinating node merges these lists into one global order and skips the first 'from' entries; in the fetch phase it retrieves the next 'size' documents. The deeper the page, the more hits every shard must track and the coordinator must merge.
Why designed this way?
The from/size design is simple and intuitive, matching common pagination patterns in databases and APIs. It allows easy access to any page of results by index. However, this simplicity trades off performance for deep pages. Alternatives like search_after were introduced later to address these limits.
┌─────────────────┐
│ Each shard      │
│ finds matches   │
└───────┬─────────┘
        │ Keep top (from + size) hits per shard
        ▼
┌─────────────────┐
│ Coordinating    │
│ node merges     │
└───────┬─────────┘
        │ Skip 'from' docs
        ▼
┌─────────────────┐
│ Return 'size'   │
│ documents       │
└─────────────────┘
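The flow in the diagram can be imitated in a few lines. This is a toy simulation under simplifying assumptions: three in-memory "shards", each document reduced to a single sort value, and lower values ranking first. It is not Elasticsearch code; `shard_top_hits` and `paginate` are illustrative names.

```python
import heapq
import random


def shard_top_hits(shard_docs, from_, size):
    """Each shard returns only its best (from + size) hits, in order."""
    return heapq.nsmallest(from_ + size, shard_docs)


def paginate(shards, from_, size):
    """Merge the per-shard candidates, then skip and slice."""
    candidates = [shard_top_hits(docs, from_, size) for docs in shards]
    merged = list(heapq.merge(*candidates))
    return merged[from_:from_ + size]


# Spread 300 toy documents over 3 shards in random order.
random.seed(0)
values = list(range(300))
random.shuffle(values)
shards = [values[0:100], values[100:200], values[200:300]]

print(paginate(shards, from_=20, size=5))  # prints [20, 21, 22, 23, 24]
```

To return 5 documents, this request ranks 25 candidates on every shard and merges 75 on the coordinator, which is the overhead that grows with 'from'.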
Myth Busters - 3 Common Misconceptions
Quick: Does setting from=1000 and size=10 return results 1001 to 1010 instantly? Commit yes or no.
Common Belief: Using from=1000 and size=10 will quickly return results 1001 to 1010 just like the first page.
Reality: Every shard must first collect and rank its top 1010 hits, and the coordinating node merges them before discarding the first 1000, so the request costs more than the first page.
Why it matters: Assuming deep pagination is fast leads to slow queries and poor user experience.
Quick: Can from/size pagination guarantee consistent results if data changes between pages? Commit yes or no.
Common Belief: Pagination with from and size always returns consistent results even if data changes during paging.
Reality: If data changes (documents added, deleted, or updated), pages may shift, causing duplicates or missing results.
Why it matters: Ignoring this can confuse users and cause incorrect data display.
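The shifting effect is easy to reproduce with a plain list standing in for the ranked hits. This is an illustrative simulation, not an Elasticsearch call; the document names are made up.

```python
def page(results, from_, size):
    """Offset-based page over a ranked list of hits."""
    return results[from_:from_ + size]


# Hits sorted newest first; the user reads page 1.
results = ["doc5", "doc4", "doc3", "doc2", "doc1"]
page_1 = page(results, from_=0, size=2)

# A new document is indexed before the user asks for page 2.
results.insert(0, "doc6")
page_2 = page(results, from_=2, size=2)

print(page_1)  # ['doc5', 'doc4']
print(page_2)  # ['doc4', 'doc3'], so doc4 is shown twice
```

A deletion between the two requests has the opposite effect: one document slides onto the page already seen and is never shown.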
Quick: Is from/size the best method for all pagination needs? Commit yes or no.
Common Belief: From/size pagination is the best and only way to paginate Elasticsearch results.
Reality: From/size is simple but inefficient for deep pages; search_after or scroll are better for large datasets.
Why it matters: Using from/size for deep pagination wastes resources and slows down systems.
Expert Zone
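1. Large 'from' values can make Elasticsearch consume a lot of heap memory, because every shard holds from + size hits in memory; in extreme cases this risks out-of-memory errors.
2. Combining from/size with complex or script-based sorting makes deep pages even more expensive, since the sort value has to be computed for every matching document.
3. From/size pagination does not guarantee stable ordering if the sort fields are not unique, leading to inconsistent page results. Adding a unique field as a final tiebreaker in the sort avoids this.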
When NOT to use
Avoid from/size pagination when you need to access very deep pages or require consistent snapshots of data. By default, requests where from + size exceeds 10,000 are rejected (the index.max_result_window setting). Instead, use search_after for efficient deep pagination or the scroll API for large batch processing.
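The guidance above could be turned into a small guard in application code. The sketch below is only a rule of thumb under stated assumptions: it uses 10,000, the default of index.max_result_window, as the cut-off, and the function names and returned labels are hypothetical.

```python
DEFAULT_MAX_RESULT_WINDOW = 10_000  # default of index.max_result_window


def fits_in_window(from_, size, window=DEFAULT_MAX_RESULT_WINDOW):
    """True if from + size stays within the result window."""
    return from_ + size <= window


def pick_strategy(from_, size, need_consistent_view=False):
    """Illustrative rule of thumb, not an official recommendation."""
    if need_consistent_view:
        return "scroll or search_after with a point in time"
    if fits_in_window(from_, size):
        return "from/size"
    return "search_after"


print(pick_strategy(0, 10))
print(pick_strategy(50_000, 10))
```

In practice many applications switch away from from/size well before the window limit, since cost rises steadily with 'from'.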
Production Patterns
In production, from/size is commonly used for UI pagination on the first few pages. For infinite scroll or deep browsing, search_after is preferred. Scroll is used for exporting or reindexing large datasets. Combining pagination with caching and filters improves performance.
Connections
Cursor-based Pagination
search_after in Elasticsearch is a form of cursor-based pagination.
Understanding cursor-based pagination helps grasp why search_after is more efficient than from/size for deep pages.
Database OFFSET-LIMIT Pagination
From/size in Elasticsearch is similar to OFFSET-LIMIT in SQL databases.
Knowing SQL pagination helps understand the performance tradeoffs of from/size in Elasticsearch.
Memory Paging in Operating Systems
Both Elasticsearch pagination and OS memory paging involve slicing large data sets into manageable chunks.
Recognizing this connection reveals common challenges in managing large data efficiently across fields.
Common Pitfalls
#1 Using very large 'from' values for deep pagination without considering performance.
Wrong approach: { "from": 100000, "size": 10, "query": { "match_all": {} } }
Correct approach: Use search_after with sort values instead of a large from: { "size": 10, "query": { "match_all": {} }, "sort": [ { "timestamp": "asc" } ], "search_after": ["last_sort_value"] }
Root cause: Misunderstanding that from skips results efficiently, ignoring the cost of sorting and skipping large numbers of documents. With default settings the wrong request is also rejected outright, because from + size exceeds index.max_result_window (10,000).
#2 Assuming from/size pagination returns stable results when data changes during paging.
Wrong approach: Paginating with from/size over a live index without handling data changes.
Correct approach: Use the scroll API, or search_after with a point in time (PIT), to get consistent results during pagination.
Root cause: Not realizing that data mutations cause shifting results and inconsistent pages.
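One way to get a consistent view is to search against a point in time (PIT). The sketch below only builds the request body and makes several assumptions: the PIT id is a placeholder that would really come from opening a point in time on the index, `timestamp` is a made-up field name, and `pit_page_body` is a hypothetical helper.

```python
def pit_page_body(pit_id, sort, search_after=None, size=10, keep_alive="1m"):
    """Build a search body that reads from a point in time.

    Pass the "sort" values of the previous page's last hit as
    `search_after` to request the following page.
    """
    body = {
        "size": size,
        "query": {"match_all": {}},
        "pit": {"id": pit_id, "keep_alive": keep_alive},
        "sort": sort,
    }
    if search_after is not None:
        body["search_after"] = search_after
    return body


PIT_ID = "<id returned when the point in time was opened>"  # placeholder
sort = [{"timestamp": "asc"}]  # placeholder field name

first_page = pit_page_body(PIT_ID, sort)
second_page = pit_page_body(PIT_ID, sort, search_after=[1700000000000])
print(second_page)
```

Because the index is identified by the PIT, such a request is sent to the search endpoint without an index name in the path.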
#3 Not setting a sort order when paginating, relying on default relevance sorting.
Wrong approach: { "from": 10, "size": 10, "query": { "match": { "field": "value" } } }
Correct approach: { "from": 10, "size": 10, "query": { "match": { "field": "value" } }, "sort": [ "_score", { "id": "asc" } ] } (here "id" stands for any unique, sortable field in your index)
Root cause: Ignoring that documents with equal scores have no guaranteed order, so without a unique tiebreaker in the sort, pages can overlap or skip results.
Key Takeaways
Pagination with from and size lets you view search results in manageable pages by skipping and limiting results.
Large from values slow down queries because every shard must collect and rank all the skipped hits before the requested page can be returned.
For deep pagination, alternatives like search_after or scroll are more efficient and scalable.
Pagination results can be inconsistent if data changes during browsing, so consider scroll or search_after with a point in time for stable views.
Understanding the internal mechanics of from/size helps avoid performance pitfalls and choose the right pagination method.