Indexing a document (POST/PUT) in Elasticsearch - Time & Space Complexity
When we add or update a document in Elasticsearch, it takes some time to process. Understanding how this time grows helps us plan for bigger data.
We want to know: How does the time to index a document change as the document size or system load grows?
Analyze the time complexity of the following code snippet.
POST /my_index/_doc/1
{
  "user": "alice",
  "message": "Hello Elasticsearch!",
  "timestamp": "2024-06-01T12:00:00Z"
}
This request indexes a document with ID 1 and three fields into the index called my_index. If a document with that ID already exists, it is replaced.
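For readers who want to issue the same request from a program, here is a minimal sketch using only the Python standard library. The base URL `http://localhost:9200` and the helper name `build_index_request` are assumptions for illustration and do not come from the original snippet. The sketch only builds the request object; the line that would send it is left commented out.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_id, document):
    """Build (but do not send) an HTTP request that indexes one document.

    base_url is an assumption: a cluster reachable without authentication.
    """
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_doc/{doc_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


document = {
    "user": "alice",
    "message": "Hello Elasticsearch!",
    "timestamp": "2024-06-01T12:00:00Z",
}
request = build_index_request("http://localhost:9200", "my_index", "1", document)
print(request.get_method(), request.full_url, len(request.data), "bytes")

# To actually send it (requires a running cluster at the assumed URL):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

The size of `request.data` is the first place document size shows up: a larger body means more bytes to transfer and parse before any field is analyzed.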
Look at what happens when indexing:
- Primary operation: Parsing and analyzing each field in the document.
- How many times: Once per field, so the number of fields matters.
As the document gets bigger with more fields or larger text, the work grows roughly in direct proportion.
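The per-field work described above can be illustrated with a toy inverted index. This is a deliberately simplified model and not how Elasticsearch or Lucene is implemented: the `analyze` function is a stand-in that lowercases and splits on whitespace, and every name here is hypothetical.

```python
from collections import defaultdict


def analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def index_document(doc_id, doc, inverted_index):
    """Add one document to a toy inverted index.

    Returns the number of tokens processed, used here as a proxy for work.
    """
    tokens_processed = 0
    for field, value in doc.items():        # one pass per field
        for token in analyze(str(value)):   # one step per token in that field
            inverted_index[(field, token)].add(doc_id)
            tokens_processed += 1
    return tokens_processed


index = defaultdict(set)
doc = {
    "user": "alice",
    "message": "Hello Elasticsearch!",
    "timestamp": "2024-06-01T12:00:00Z",
}
work = index_document("1", doc, index)
print(work)  # 4: one token each for user and timestamp, two for message
```

The two nested loops make the cost visible: the total work is the sum of the token counts over all fields, so it grows with both the number of fields and the length of their text.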
| Input Size (fields) | Approx. Operations |
|---|---|
| 10 | 10 units of parsing and indexing |
| 100 | 100 units of parsing and indexing |
| 1000 | 1000 units of parsing and indexing |
Pattern observation: Doubling the number of fields roughly doubles the work needed.
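The pattern in the table can be reproduced with a small counting model. It is a sketch under stated assumptions: work is measured as one unit per whitespace-separated token, every synthetic field holds the same number of words, and `make_document` and `work_units` are hypothetical helpers. It counts abstract units and does not measure real indexing time.

```python
def make_document(num_fields, words_per_field=5):
    """Create a synthetic document with num_fields text fields of equal length."""
    text = " ".join(["word"] * words_per_field)
    return {f"field_{i}": text for i in range(num_fields)}


def work_units(doc):
    """Assumed cost model: one unit of work per token across all fields."""
    return sum(len(str(value).split()) for value in doc.values())


for n in (10, 100, 1000):
    print(n, work_units(make_document(n)))
# 10 50
# 100 500
# 1000 5000
```

With five words per field the unit counts are five times those in the table, but the shape is the same: ten times the fields gives ten times the work.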
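Time Complexity: O(n), where n is the number of fields in the document (more precisely, the total amount of field content to parse and analyze).
This means the time to index a single document grows linearly with its number of fields or its overall size. The estimate covers per-document processing only; background work such as segment merges is not included.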
[X] Wrong: "Indexing a document always takes the same time no matter its size."
[OK] Correct: Larger documents have more fields and text to process, so they take more time to index.
Knowing how indexing time grows helps you design efficient data flows and plan for larger data volumes. It also shows that you understand how the system behaves in practice.
"What if we indexed documents with nested objects or arrays? How would the time complexity change?"
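One way to start exploring this question is to count leaf values instead of top-level fields. The sketch below is a toy model with a hypothetical `flatten` helper; it mirrors the way plain object fields are flattened into dotted paths and array elements are treated as multiple values of one field. It does not model the `nested` field type, where each nested object is indexed as a separate hidden document and adds further per-object cost.

```python
def flatten(value, prefix=""):
    """Yield (dotted_path, leaf_value) pairs for a nested document."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from flatten(child, path)
    elif isinstance(value, list):
        # Array elements share the path of the field that holds them.
        for item in value:
            yield from flatten(item, prefix)
    else:
        yield prefix, value


doc = {
    "user": {"name": "alice", "roles": ["admin", "editor"]},
    "message": "Hello Elasticsearch!",
    "tags": ["greeting", "demo", "test"],
}
leaves = list(flatten(doc))
print(len(doc), len(leaves))  # 3 top-level fields, 7 leaf values
```

Under this model the growth is still linear, but n is better read as the total number of leaf values (and their tokens) than as the number of top-level fields.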