Enrich processor in Elasticsearch - Time & Space Complexity

Time Complexity (Enrich processor): O(n)

Understanding Time Complexity

When using the enrich processor in Elasticsearch, it's important to understand how the time to process documents changes as the number of documents grows.

We want to know how the processor's work increases when more documents need enrichment.

Scenario Under Consideration

Analyze the time complexity of the following request, which simulates an ingest pipeline containing a single enrich processor.


POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "enrich": {
          "policy_name": "user_policy",
          "field": "user_id",
          "target_field": "user_info"
        }
      }
    ]
  },
  "docs": [
    {"_source": {"user_id": "123"}}
  ]
}
    

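This request simulates enriching a document: the processor takes the document's user_id, looks it up through the stored enrich policy user_policy, and writes the matching data to user_info. The policy must already exist and have been executed so that its enrich index is available.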
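
To try the same simulation with more than one document, the request body can be generated instead of typed by hand. The following is a minimal Python sketch that reuses the policy and field names from the snippet above. `build_simulate_body` is a hypothetical helper introduced here for illustration; it only assembles JSON and sends nothing to a cluster.

```python
import json


def build_simulate_body(user_ids):
    """Build a _simulate request body with one doc per user_id.

    Hypothetical helper for illustration: it only assembles the JSON
    structure and does not contact Elasticsearch.
    """
    return {
        "pipeline": {
            "processors": [
                {
                    "enrich": {
                        "policy_name": "user_policy",
                        "field": "user_id",
                        "target_field": "user_info",
                    }
                }
            ]
        },
        # One entry per document: this list is the "n" in the analysis.
        "docs": [{"_source": {"user_id": uid}} for uid in user_ids],
    }


body = build_simulate_body(["123", "456", "789"])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of the POST _ingest/pipeline/_simulate request. Only the "docs" array grows as more user IDs are passed in; the pipeline definition stays the same size.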

Identify Repeating Operations

In this process, the main repeating operation is the lookup for each document.

  • Primary operation: For each document, the enrich processor performs a lookup in the enrich index.
  • How many times: Once per document being processed.
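
The per-document lookup described above can be modeled in a few lines of Python. This is a toy model, not how Elasticsearch is implemented: a plain dict stands in for the enrich index, and `ENRICH_INDEX` and `enrich_docs` are names made up for illustration.

```python
# Toy stand-in for the enrich index: match field value -> enrich data.
ENRICH_INDEX = {
    "123": {"name": "Alice"},
    "456": {"name": "Bob"},
}


def enrich_docs(docs, field="user_id", target_field="user_info"):
    """Enrich each doc with one lookup; return (enriched docs, lookup count)."""
    lookups = 0
    enriched = []
    for doc in docs:
        lookups += 1  # exactly one lookup per document
        match = ENRICH_INDEX.get(doc.get(field))
        out = dict(doc)  # copy so the input docs are left unchanged
        if match is not None:
            out[target_field] = match
        enriched.append(out)
    return enriched, lookups


docs = [{"user_id": "123"}, {"user_id": "456"}, {"user_id": "999"}]
enriched, lookups = enrich_docs(docs)
print(lookups)  # 3
```

The counter goes up once per document, whether or not a match is found, which is the repeating operation the analysis counts.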

How Execution Grows With Input

As the number of documents increases, the total number of lookups grows linearly.

Input Size (n) | Approx. Operations (lookups)
10             | 10 lookups
100            | 100 lookups
1000           | 1000 lookups

Pattern observation: Each new document adds exactly one more lookup, so the total work grows linearly with the number of documents.

Final Time Complexity

Time Complexity: O(n)

This means the time to enrich documents grows in direct proportion to the number of documents you process, assuming the cost of each individual lookup stays roughly constant.

Common Mistake

[X] Wrong: "The enrich processor does all lookups once and reuses results, so time stays the same no matter how many documents."

[OK] Correct: The processor runs once for every document it receives, and each document carries its own key to match, so the total work still grows with the number of documents.
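
Reusing results can make individual lookups cheaper, but it does not remove the per-document step. The sketch below is a toy model under stated assumptions: Python's `functools.lru_cache` plays the role of a cache, and `search_enrich_index` is a hypothetical stand-in for a search against the enrich index, not a real Elasticsearch call.

```python
from functools import lru_cache

index_searches = 0  # counts simulated searches against the enrich index


@lru_cache(maxsize=1024)
def search_enrich_index(user_id):
    """Hypothetical stand-in for a search against the enrich index."""
    global index_searches
    index_searches += 1  # only runs on a cache miss
    return f"info-for-{user_id}"


def enrich_all(user_ids):
    """Process every document; return how many documents were handled."""
    handled = 0
    for uid in user_ids:
        search_enrich_index(uid)  # served from the cache for repeated keys
        handled += 1
    return handled


user_ids = ["123", "456", "123", "123", "456"]
handled = enrich_all(user_ids)
print(handled, index_searches)  # 5 2
```

Five documents are still handled one by one, even though only two distinct keys needed a search. In this model the cache lowers the cost per document; the number of documents to process is unchanged.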

Interview Connect

Understanding how the enrich processor scales helps you explain how Elasticsearch handles data enrichment efficiently, a useful skill when discussing data pipelines and search performance.

Self-Check

"What if the enrich index were cached in memory for faster lookups? How would that affect the time complexity?"