Reindexing data in Elasticsearch - Time & Space Complexity
When reindexing data in Elasticsearch, it is important to understand how the time to complete the task grows as the amount of data increases.
We want to know how the number of documents affects the time it takes to copy them from one index to another.
Analyze the time complexity of the following code snippet.
POST _reindex
{
  "source": {
    "index": "old_index"
  },
  "dest": {
    "index": "new_index"
  }
}
This code copies all documents from "old_index" to "new_index" in Elasticsearch.
Identify the operations that repeat: loops, recursion, or traversals over the data.
- Primary operation: Reading and writing each document from the source index to the destination index.
- How many times: Once for every document in the source index.
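The repeated operation can be modeled with a short Python sketch. It does not call Elasticsearch: `simulate_reindex` is a hypothetical helper, and the batching only mirrors the way `_reindex` reads the source in scroll batches (1000 documents by default) before writing them to the destination.

```python
def simulate_reindex(n_docs, batch_size=1000):
    """Toy model of a reindex-style copy; counts work instead of moving data."""
    reads = 0
    writes = 0
    batches = 0
    remaining = n_docs
    while remaining > 0:
        # Each pass stands in for one scroll batch read and one bulk write.
        batch = min(batch_size, remaining)
        reads += batch      # every document in the batch is read once
        writes += batch     # every document in the batch is written once
        batches += 1
        remaining -= batch
    return {"reads": reads, "writes": writes, "batches": batches}


print(simulate_reindex(2500))
# {'reads': 2500, 'writes': 2500, 'batches': 3}
```

Batching reduces the number of round trips, but every document is still read once and written once, so the total work stays proportional to the document count.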
As the number of documents grows, the time to reindex grows roughly in direct proportion.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 document reads and writes |
| 100 | About 100 document reads and writes |
| 1000 | About 1000 document reads and writes |
Pattern observation: The work grows linearly as the number of documents increases.
Time Complexity: O(n)
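This means the time to reindex grows in direct proportion to the number of documents: assuming documents of similar size and steady cluster throughput, doubling the documents roughly doubles the time.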
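One practical use of a linear model is rough extrapolation from a small trial run. The sketch below is only an illustration: `estimate_reindex_seconds` is a hypothetical helper, the throughput figure is made up, and real rates vary with document size, mappings, and cluster load.

```python
def estimate_reindex_seconds(n_docs, docs_per_second):
    """Linear estimate: time grows in proportion to the number of documents."""
    if docs_per_second <= 0:
        raise ValueError("docs_per_second must be positive")
    return n_docs / docs_per_second


# Hypothetical measurement: a trial run copied 50,000 documents in 20 seconds.
measured_rate = 50_000 / 20  # 2500 documents per second

print(estimate_reindex_seconds(1_000_000, measured_rate))
# 400.0
```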
[X] Wrong: "Reindexing time stays the same no matter how many documents there are."
[OK] Correct: Every document must be read from the source index and written to the destination index (Elasticsearch does this in batches), so more documents mean more work and more time.
Understanding how reindexing scales helps you explain how Elasticsearch handles large data migrations and why performance matters in real projects.
"What if we added a query to reindex only a subset of documents? How would the time complexity change?"
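As a starting point for that question, here is a minimal sketch of a request body that restricts the copy with `source.query`. The helper `build_reindex_body` and the `status` field with its value are hypothetical; only documents that match the query are copied, so the copy work scales with the number of matches rather than with the full index, while the cost of finding those matches depends on the query itself.

```python
import json


def build_reindex_body(source_index, dest_index, query=None):
    """Build a _reindex request body, optionally limited to matching documents."""
    source = {"index": source_index}
    if query is not None:
        # Only documents matching this query are read and copied.
        source["query"] = query
    return {"source": source, "dest": {"index": dest_index}}


# Hypothetical filter: copy only documents whose "status" field is "active".
body = build_reindex_body(
    "old_index",
    "new_index",
    query={"term": {"status": "active"}},
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of `POST _reindex`.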