Geo-point and geo-shape types in Elasticsearch - Time & Space Complexity
When working with geo-point and geo-shape types in Elasticsearch, it is important to understand how query time grows as the number of locations or shapes increases.
We want to know how the search time changes when we add more geographic data.
Analyze the time complexity of the following Elasticsearch geo query.
GET /places/_search
{
"query": {
"geo_shape": {
"location": {
"shape": {
"type": "polygon",
"coordinates": [[[30, 10], [40, 40], [20, 40], [10, 20], [30, 10]]]
},
"relation": "within"
}
}
}
}
This query finds documents where the geo-shape field "location" is within the given polygon.
In this geo-shape query, Elasticsearch checks each document's shape against the query polygon.
- Primary operation: Checking spatial relation between shapes.
- How many times: Once per document in the index.
As the number of documents with geo-shapes grows, the number of shape comparisons grows roughly the same.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 shape checks |
| 100 | 100 shape checks |
| 1000 | 1000 shape checks |
Pattern observation: The work grows linearly as more geo-shapes are stored.
Time Complexity: O(n)
This means the query time grows in direct proportion to the number of geo-shape documents.
[X] Wrong: "Geo queries always run instantly no matter how many shapes there are."
[OK] Correct: Each shape must be checked, so more shapes mean more work and longer query times.
Understanding how geo queries scale helps you explain performance in real projects and shows you can think about data size effects clearly.
What if we added a spatial index like an R-tree? How would the time complexity change?