Runtime fields in Elasticsearch - Time & Space Complexity
When using runtime fields in Elasticsearch, it's important to understand how the time to process data changes as the amount of data grows.
We want to know how the cost of computing these fields scales with the number of documents.
Analyze the time complexity of the following runtime field script.
GET my-index/_search
{
"runtime_mappings": {
"full_name": {
"type": "keyword",
"script": {
"source": "emit(doc['first_name'].value + ' ' + doc['last_name'].value)"
}
}
},
"query": { "match_all": {} }
}
This code defines a runtime field that combines first and last names for each document during search.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: The script runs once for each document matched by the query.
- How many times: Equal to the number of documents returned by the query.
As the number of documents increases, the script runs more times, once per document.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 script executions |
| 100 | 100 script executions |
| 1000 | 1000 script executions |
Pattern observation: The work grows directly with the number of documents processed.
Time Complexity: O(n)
This means the time to compute runtime fields grows linearly with the number of documents.
[X] Wrong: "Runtime fields are computed once and reused, so their cost is constant regardless of data size."
[OK] Correct: Runtime fields are computed on the fly for each document during search, so the cost increases with more documents.
Understanding how runtime fields affect search performance shows you can reason about costs in real systems, a valuable skill for building efficient search solutions.
"What if the runtime field script included a loop over an array field inside each document? How would the time complexity change?"