
How to Use Point in Time for Pagination in Elasticsearch

Use a point in time (PIT) in Elasticsearch to paginate over a consistent view of your data. Open a PIT with POST /<index>/_pit?keep_alive=1m, then pass the returned id in the pit object of your search requests together with search_after to page through results without missing or duplicating hits.
📐

Syntax

The basic syntax involves three parts:

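  • Open a PIT: Request a point in time on the target index to get a PIT id.
  • Search with PIT: Pass that id in the pit object of a POST /_search request, along with search_after for pagination. The PIT already identifies the index, so leave the index out of the request path.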
  • Close the PIT: Optionally close the PIT to free resources.

This keeps your pagination consistent even if data changes during navigation.

json
POST /your-index-name/_pit?keep_alive=1m

# Omit search_after on the first page; on later pages pass one value per sort field.
POST /_search
{
  "size": 10,
  "pit": {
    "id": "<id_from_previous_response>",
    "keep_alive": "1m"
  },
  "search_after": ["<timestamp_sort_value_from_last_hit>", "<shard_doc_sort_value_from_last_hit>"],
  "sort": [
    {"timestamp": "asc"},
    {"_shard_doc": "asc"}
  ]
}

DELETE /_pit
{
  "id": "<pit_id>"
}
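The three parts above can also be sketched in client code. The Python below is a minimal illustration, not an official client: the helper names (open_pit_request, search_request, close_pit_request) are hypothetical, and each one only builds the method, path, and body for Elasticsearch's _pit and _search endpoints so that you can send them with whichever HTTP client you use.

```python
from typing import Any, Dict, List, Optional, Tuple

# (HTTP method, path, JSON body or None); a hypothetical shape for illustration.
Request = Tuple[str, str, Optional[Dict[str, Any]]]


def open_pit_request(index: str, keep_alive: str = "1m") -> Request:
    """Build the request that opens a PIT on an index."""
    # keep_alive travels as a query parameter; the request has no body.
    return ("POST", f"/{index}/_pit?keep_alive={keep_alive}", None)


def search_request(
    pit_id: str,
    sort: List[Dict[str, Any]],
    size: int = 10,
    keep_alive: str = "1m",
    search_after: Optional[List[Any]] = None,
) -> Request:
    """Build one page request against an open PIT."""
    body: Dict[str, Any] = {
        "size": size,
        "pit": {"id": pit_id, "keep_alive": keep_alive},
        "sort": sort,
    }
    # The first page has no search_after; later pages pass the last hit's sort values.
    if search_after is not None:
        body["search_after"] = search_after
    # No index in the path: the PIT already identifies the target.
    return ("POST", "/_search", body)


def close_pit_request(pit_id: str) -> Request:
    """Build the request that closes a PIT."""
    return ("DELETE", "/_pit", {"id": pit_id})
```

Keeping request construction separate from the HTTP call makes it easy to check the bodies in unit tests before pointing them at a real cluster.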
💻

Example

This example shows how to open a PIT, perform a paginated search using search_after, and then close the PIT.

json
POST /products/_pit?keep_alive=2m

# Response:
# {
#   "id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAA..."
# }

# First page: no search_after, and no index in the path.
POST /_search
{
  "size": 5,
  "pit": {
    "id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAA...",
    "keep_alive": "2m"
  },
  "sort": [
    {"price": "asc"},
    {"_shard_doc": "asc"}
  ]
}

# Assume last hit sort values: [29.99, 10005]

# Next page: pass the last hit's sort values in search_after.
POST /_search
{
  "size": 5,
  "pit": {
    "id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAA...",
    "keep_alive": "2m"
  },
  "search_after": [29.99, 10005],
  "sort": [
    {"price": "asc"},
    {"_shard_doc": "asc"}
  ]
}

DELETE /_pit
{
  "id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAA..."
}
Output
# Response to the open PIT request:
{ "id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAA..." }

# Response to the first search (only the hits are shown):
{
  "hits": {
    "hits": [
      {"_id": "1", "sort": [9.99, 10001]},
      {"_id": "2", "sort": [14.99, 10002]},
      {"_id": "3", "sort": [19.99, 12345]},
      {"_id": "4", "sort": [24.99, 10004]},
      {"_id": "5", "sort": [29.99, 10005]}
    ]
  }
}
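In application code, the "take the last hit's sort values and search again" step usually becomes a loop that stops at the first empty page. The sketch below assumes a caller-supplied search function that accepts a request body and returns the parsed response; iter_hits and make_fake_search are hypothetical names, and the fake search is only an in-memory stand-in so the loop can run without a cluster.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_hits(
    search: SearchFn,
    pit_id: str,
    sort: List[Dict[str, Any]],
    size: int = 5,
    keep_alive: str = "2m",
) -> Iterator[Dict[str, Any]]:
    """Yield every hit, one page at a time, using PIT + search_after."""
    search_after: Optional[List[Any]] = None
    while True:
        body: Dict[str, Any] = {
            "size": size,
            "pit": {"id": pit_id, "keep_alive": keep_alive},
            "sort": sort,
        }
        if search_after is not None:
            body["search_after"] = search_after
        response = search(body)
        # A search response may carry an updated PIT id; prefer it when present.
        pit_id = response.get("pit_id", pit_id)
        hits = response["hits"]["hits"]
        if not hits:
            return  # an empty page means there is nothing left
        yield from hits
        search_after = hits[-1]["sort"]


def make_fake_search(docs: List[Dict[str, Any]]) -> SearchFn:
    """In-memory stand-in for POST /_search (assumption: docs carry a 'sort' list)."""
    ordered = sorted(docs, key=lambda d: d["sort"])

    def search(body: Dict[str, Any]) -> Dict[str, Any]:
        after = body.get("search_after")
        remaining = [d for d in ordered if after is None or d["sort"] > after]
        return {"hits": {"hits": remaining[: body["size"]]}}

    return search


# Twelve made-up documents, paged five at a time.
docs = [{"_id": str(i), "sort": [4.99 + 5 * i, 10000 + i]} for i in range(1, 13)]
sort = [{"price": "asc"}, {"_shard_doc": "asc"}]
ids = [hit["_id"] for hit in iter_hits(make_fake_search(docs), "example-pit-id", sort)]
print(ids)
```

Against a real cluster, search would wrap an HTTP call to POST /_search, and the PIT would be closed once the loop finishes.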
⚠️

Common Pitfalls

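  • Not using search_after with PIT: Without it, every request returns the first page again, so pages repeat and later results are never reached.
  • Not sorting on unique fields: If two documents share the same sort values, their relative order is undefined and a page boundary can skip or repeat one of them. Include a tiebreaker like _shard_doc. Recent Elasticsearch versions add _shard_doc implicitly to PIT searches, but listing it explicitly keeps the sort and the search_after array unambiguous.
  • Forgetting to keep PIT alive: The PIT expires once keep_alive elapses. Each search request that sets keep_alive extends it, so the value only needs to cover the gap until the next request, not the whole pagination session.
  • Not closing PIT: Closing is optional because the PIT expires on its own, but closing it as soon as you are done frees cluster resources earlier.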
json
# Wrong way: sort on price alone, with no explicit tiebreaker
POST /_search
{
  "size": 5,
  "pit": {
    "id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAA...",
    "keep_alive": "1m"
  },
  "sort": [
    {"price": "asc"}
  ]
}

# Right way: add _shard_doc as the last sort field
POST /_search
{
  "size": 5,
  "pit": {
    "id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAA...",
    "keep_alive": "1m"
  },
  "sort": [
    {"price": "asc"},
    {"_shard_doc": "asc"}
  ]
}
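Two of these pitfalls are easy to guard against before a request is sent. The following Python sketch uses hypothetical helpers (with_tiebreaker and check_search_after are not part of any Elasticsearch client): the first appends _shard_doc when the sort lacks it, and the second rejects a search_after array whose length does not match the number of sort fields.

```python
from typing import Any, Dict, List, Union

SortEntry = Union[str, Dict[str, Any]]


def sort_field(entry: SortEntry) -> str:
    """Return the field name of a sort entry such as "price" or {"price": "asc"}."""
    if isinstance(entry, str):
        return entry
    return next(iter(entry))


def with_tiebreaker(sort: List[SortEntry], tiebreaker: str = "_shard_doc") -> List[SortEntry]:
    """Return a copy of sort that ends with a tiebreaker field."""
    if tiebreaker in [sort_field(entry) for entry in sort]:
        return list(sort)
    return list(sort) + [{tiebreaker: "asc"}]


def check_search_after(sort: List[SortEntry], search_after: List[Any]) -> None:
    """Raise if search_after does not supply exactly one value per sort field."""
    if len(search_after) != len(sort):
        raise ValueError(
            f"search_after has {len(search_after)} values but sort has {len(sort)} fields"
        )


sort = with_tiebreaker([{"price": "asc"}])
check_search_after(sort, [19.99, 12345])  # passes: two sort fields, two values
```

Running check_search_after on the request just before sending it turns a silent pagination bug into an immediate, readable error.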
📊

Quick Reference

Point in Time (PIT) Pagination Cheat Sheet:

Step | Action          | Notes
1    | Open PIT        | Use POST /<index>/_pit?keep_alive=<duration>
2    | Search with PIT | Include pit.id and search_after with a stable sort
3    | Paginate        | Use the last hit's sort values in search_after
4    | Close PIT       | Optional but recommended to free resources

Key Takeaways

  • Use point in time (PIT) to get a consistent snapshot for pagination in Elasticsearch.
  • Always combine PIT with search_after and a stable sort including a tiebreaker like _shard_doc.
  • Set and extend keep_alive to keep the PIT valid during pagination.
  • Close the PIT when done to free cluster resources.
  • Avoid using from/size for deep pagination; PIT with search_after is more efficient.
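To make the last takeaway concrete: Elasticsearch rejects a from/size request when from + size exceeds index.max_result_window, which defaults to 10,000. The small Python sketch below (from_size_allowed is a hypothetical helper, and pages are assumed to be numbered from 1) shows where that limit falls for a given page size.

```python
def from_size_allowed(page: int, size: int, max_result_window: int = 10_000) -> bool:
    """Return True if a 1-based page can be fetched with from/size."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    start = (page - 1) * size  # the value that would be sent as "from"
    return start + size <= max_result_window


print(from_size_allowed(2000, 5))  # from=9995, size=5: exactly at the limit
print(from_size_allowed(2001, 5))  # from=10000, size=5: past the limit
```

PIT with search_after has no such ceiling, because each request continues from the last hit's sort values rather than counting from the start of the result set.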