Why documents are the unit of data in Elasticsearch - Performance Analysis

Understanding Time Complexity

When working with Elasticsearch, documents are the main pieces of data we store and search.

We want to understand how the time it takes to handle data grows as we add more documents.

Scenario Under Consideration

Analyze the time complexity of indexing multiple documents.


POST /my_index/_bulk
{ "index": { "_id": "1" } }
{ "name": "Alice", "age": 30 }
{ "index": { "_id": "2" } }
{ "name": "Bob", "age": 25 }
{ "index": { "_id": "3" } }
{ "name": "Carol", "age": 27 }
    

This request adds three documents to my_index in a single bulk call. Each action line is followed by the document it applies to, and each document is one unit of data.
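To make the one-document-per-unit idea concrete, here is a minimal Python sketch that builds the same kind of _bulk request body from a list of records. The helper name build_bulk_body is a hypothetical name chosen for illustration and is not part of any Elasticsearch client. The sketch only produces the newline-delimited JSON text and does not send it anywhere.

```python
import json

def build_bulk_body(docs):
    """Return an NDJSON _bulk body: one action line plus one source line per document.

    build_bulk_body is a hypothetical helper used only for this illustration.
    """
    lines = []
    for doc_id, source in docs:
        # Action line: asks Elasticsearch to index a document with this _id.
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"

docs = [
    ("1", {"name": "Alice", "age": 30}),
    ("2", {"name": "Bob", "age": 25}),
    ("3", {"name": "Carol", "age": 27}),
]
body = build_bulk_body(docs)
print(body, end="")
```

The loop runs once per document, so the body always has two lines for each document passed in.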

Identify Repeating Operations

Look at what repeats when adding documents.

  • Primary operation: Indexing each document one by one.
  • How many times: Once per document added.

How Execution Grows With Input

As you add more documents, the work grows with the number of documents.

Input Size (n)    Approx. Operations
10                10 indexing operations
100               100 indexing operations
1000              1000 indexing operations

Pattern observation: The time grows in direct proportion to the number of documents added.
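The pattern can be checked with a toy model. The sketch below is an assumption-laden illustration, not how Elasticsearch stores data: it uses a plain Python dictionary as a stand-in index and counts one operation for every document it stores. The function name index_documents and the generated sample documents are hypothetical.

```python
def index_documents(docs):
    """Store each document in a toy in-memory index and count the operations.

    A dict stands in for the index; this is an illustration, not Elasticsearch.
    """
    index = {}
    operations = 0
    for doc_id, source in docs:
        index[doc_id] = source  # one indexing operation per document
        operations += 1
    return index, operations

for n in (10, 100, 1000):
    # Generate n small sample documents with distinct ids.
    docs = [(str(i), {"name": f"user{i}"}) for i in range(n)]
    _, operations = index_documents(docs)
    print(n, operations)
```

For each input size, the operation count equals the number of documents, matching the one-operation-per-document pattern.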

Final Time Complexity

Time Complexity: O(n)

This means indexing time grows linearly: doubling the number of documents roughly doubles the work, assuming the documents are of similar size.

Common Mistake

[X] Wrong: "Adding more documents takes the same time no matter how many there are."

[OK] Correct: Each document needs its own processing, so more documents mean more work and more time.

Interview Connect

Understanding how data size affects processing time helps you explain how Elasticsearch handles scaling in real projects.

Self-Check

"What if we indexed documents in parallel instead of one by one? How would the time complexity change?"