Pipeline testing in Elasticsearch - Time & Space Complexity
When testing an Elasticsearch ingest pipeline, we want to know how the time to process data changes as the input grows.
We ask: How does the pipeline's processing time grow when we test it with more documents?
Analyze the time complexity of the following pipeline test request.
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      { "set": { "field": "field1", "value": "value1" } },
      { "uppercase": { "field": "field1" } }
    ]
  },
  "docs": [
    { "_source": { "field1": "test" } },
    { "_source": { "field1": "example" } }
  ]
}
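This request simulates an inline pipeline with two processors (set, then uppercase) against two sample documents and returns each document as it looks after processing. Because set overwrites field1 by default, both documents end with field1 equal to "VALUE1".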
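To see how the test behaves with more documents, the request body can be generated instead of written by hand. The following is a minimal sketch in Python; `build_simulate_body` is a hypothetical helper name and the generated field values are placeholders. It only builds the JSON body and does not send it, since that would need a running cluster and an HTTP client.

```python
import json


def build_simulate_body(num_docs):
    """Build a _simulate request body with the two processors from the
    example and num_docs generated documents (hypothetical helper)."""
    return {
        "pipeline": {
            "processors": [
                {"set": {"field": "field1", "value": "value1"}},
                {"uppercase": {"field": "field1"}},
            ]
        },
        # Placeholder documents; the values are illustrative only.
        "docs": [{"_source": {"field1": f"test-{i}"}} for i in range(num_docs)],
    }


body = build_simulate_body(3)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted under `POST _ingest/pipeline/_simulate` to run the same pipeline against any number of documents.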
Look for repeated actions in the pipeline test.
- Primary operation: Each document is processed through all processors in the pipeline.
- How many times: The number of documents times the number of processors.
As we add more documents, the total processing steps increase proportionally.
| Input Size (n documents) | Approx. Operations (2 processors x n documents) |
|---|---|
| 10 | 20 |
| 100 | 200 |
| 1000 | 2000 |
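Pattern observation: With the pipeline fixed, the operation count scales directly with the document count. Ten times the documents means ten times the operations, and doubling the documents doubles them.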
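The counts in the table can be reproduced with a small local model of the pipeline. This is a sketch in Python, not Elasticsearch itself: `set_field`, `uppercase_field`, and `simulate` are hypothetical names that mimic the two processors and count one step each time a processor runs on a document.

```python
def set_field(doc):
    # Mimics the "set" processor: overwrite field1 with a fixed value.
    doc["field1"] = "value1"


def uppercase_field(doc):
    # Mimics the "uppercase" processor applied to field1.
    doc["field1"] = doc["field1"].upper()


PROCESSORS = [set_field, uppercase_field]


def simulate(docs, processors=PROCESSORS):
    """Run every processor on every document.

    Returns the processed documents and the number of processor runs."""
    steps = 0
    results = []
    for source in docs:
        doc = dict(source)  # copy so the input document is not mutated
        for processor in processors:
            processor(doc)
            steps += 1
        results.append(doc)
    return results, steps


for n in (10, 100, 1000):
    docs = [{"field1": "test"} for _ in range(n)]
    results, steps = simulate(docs)
    print(n, steps)  # prints 10 20, then 100 200, then 1000 2000
```

The model assumes every processor runs once per document and takes roughly constant time, which is what makes the step count a reasonable stand-in for processing time.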
Time Complexity: O(n)
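This means the processing time grows linearly with the number of documents tested, as long as the number of processors stays fixed and each processor does a roughly constant amount of work per document. If the number of processors p also varies, the total work is O(n x p).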
[X] Wrong: "Adding more processors doesn't affect the test time much."
[OK] Correct: Each processor runs on every document, so more processors multiply the work and increase time.
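The effect of pipeline length can be checked the same way by holding the document count fixed and varying the number of processors. This is a minimal counting sketch in Python; `count_steps` is a hypothetical helper, and it assumes every processor runs on every document with no conditions or failures.

```python
def count_steps(num_docs, num_processors):
    """Count processor runs when each document passes through each processor."""
    steps = 0
    for _ in range(num_docs):
        for _ in range(num_processors):
            steps += 1  # one processor applied to one document
    return steps


for p in (2, 4, 8):
    print(p, count_steps(100, p))  # prints 2 200, then 4 400, then 8 800
```

With 100 documents, doubling the processors doubles the steps, just as doubling the documents does.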
Understanding how pipeline testing time grows helps you design efficient tests and pipelines, a useful skill in real projects.
"What if the pipeline had nested processors or conditional steps? How would that affect the time complexity?"