Deleting documents in Elasticsearch - Time & Space Complexity
When deleting documents in Elasticsearch, it's important to know how the time taken changes as more documents are involved.
We want to understand how the cost grows when deleting one or many documents.
Analyze the time complexity of the following code snippet.
POST /my-index/_delete_by_query
{
"query": {
"term": { "status": "inactive" }
}
}
This code deletes all documents in "my-index" where the status is "inactive".
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Elasticsearch scans documents matching the query and deletes each one.
- How many times: Once for each matching document in the index.
As the number of documents matching the query grows, the number of delete operations grows too.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 delete checks and deletes |
| 100 | About 100 delete checks and deletes |
| 1000 | About 1000 delete checks and deletes |
Pattern observation: The work grows roughly in direct proportion to the number of documents to delete.
Time Complexity: O(n)
This means the time to delete documents grows linearly with how many documents match the query.
[X] Wrong: "Deleting documents is always instant regardless of how many match."
[OK] Correct: Each matching document must be found and deleted, so more matches mean more work and more time.
Understanding how deletion scales helps you explain performance in real systems and shows you can think about costs beyond just writing code.
"What if we changed the query to delete documents by ID one at a time? How would the time complexity change?"