Time-series collections in MongoDB - Time & Space Complexity
When working with time-series collections in MongoDB, it is important to understand how the time to run queries grows as the amount of data increases.
We want to know how the database handles many time-stamped records efficiently.
Analyze the time complexity of the following MongoDB query on a time-series collection.
```javascript
db.sensorData.find({
  timestamp: { $gte: ISODate("2024-01-01T00:00:00Z"), $lt: ISODate("2024-01-02T00:00:00Z") }
})
  .sort({ timestamp: 1 })
  .limit(100)
```
This query fetches up to 100 records from a time-series collection for one day, sorted by time.
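For context, a collection like `sensorData` might be created as sketched below. The source does not show the collection's definition, so the `metaField` name `sensorId` and the `granularity` value are assumptions, and the helper `createSensorCollection` is hypothetical. In mongosh you would pass the global `db` to it.

```javascript
// Hypothetical helper: creates a time-series collection named "sensorData".
// `db` is a parameter so the sketch does not depend on mongosh globals.
function createSensorCollection(db) {
  return db.createCollection("sensorData", {
    timeseries: {
      timeField: "timestamp",  // the field the query filters and sorts on
      metaField: "sensorId",   // assumed name for the per-sensor label
      granularity: "minutes",  // assumed hint about how often readings arrive
    },
  });
}

// Usage in mongosh (not run here):
// createSensorCollection(db);
```

Declaring `timeField` tells MongoDB which field orders the measurements, which is what allows a time-range query to skip data outside the range.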
Look at what repeats as the data grows:
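- Primary operation: Reading the index entries or buckets whose time bounds overlap the requested range. MongoDB stores time-series measurements in internal buckets grouped by time, so buckets that fall entirely outside the range can be skipped.
- How many times: Roughly once per record in the requested range, not once per record in the collection.
As the total data grows, the query still reads only the requested time range. The table below assumes the number of records per day stays the same while the collection covers more days.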
| Input Size (n) | Approx. Operations |
|---|---|
| 10,000 records total | Operations proportional to records in one day |
| 100,000 records total | Still proportional to one day's records, not total |
| 1,000,000 records total | Still proportional to one day's records, not the total |
Pattern observation: The query cost grows with the size of the time range, not the total data size.
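The pattern can be illustrated with a small simulation. This is a simplified model, not MongoDB's actual storage engine: a sorted array of timestamps stands in for time-ordered data, and the one-reading-per-minute rate and the first reading's date are assumptions chosen for the example. It counts the steps needed to locate the start of the range and the entries read inside it.

```javascript
// Simplified model: a sorted array of timestamps (ms) stands in for
// time-ordered storage. Illustration only, not MongoDB internals.
function makeTimestamps(total, stepMs, firstMs) {
  return Array.from({ length: total }, (_, i) => firstMs + i * stepMs);
}

// Binary search for the first position whose timestamp is >= target.
function lowerBound(sorted, target) {
  let lo = 0, hi = sorted.length, steps = 0;
  while (lo < hi) {
    steps++;
    const mid = (lo + hi) >>> 1;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return { index: lo, steps };
}

// Seek to the start of the range, then walk forward until the range ends
// or the limit is reached.
function rangeScan(sorted, from, to, limit = Infinity) {
  const { index, steps } = lowerBound(sorted, from);
  let scanned = 0;
  for (let i = index; i < sorted.length && sorted[i] < to && scanned < limit; i++) {
    scanned++;
  }
  return { seekSteps: steps, scanned };
}

const MINUTE = 60 * 1000;
const first = Date.UTC(2023, 11, 30); // assumed time of the first reading
const from = Date.UTC(2024, 0, 1);    // start of the requested day
const to = Date.UTC(2024, 0, 2);      // end of the requested day (exclusive)

const results = [10000, 100000, 1000000].map((total) => {
  const ts = makeTimestamps(total, MINUTE, first); // one reading per minute
  return { total, ...rangeScan(ts, from, to) };
});

for (const r of results) {
  console.log(`total=${r.total} seekSteps=${r.seekSteps} scanned=${r.scanned}`);
}
```

In this model `scanned` is 1,440 (one day of per-minute readings) for every collection size, while `seekSteps` grows only logarithmically. Passing `100` as the fourth argument to `rangeScan` stops the walk after 100 entries, which mirrors the effect of `.limit(100)` when results can be read in time order.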
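Time Complexity: O(k), where k is the number of records in the requested time range
This means the query time grows linearly with the number of records in the requested time range, not with the size of the whole collection. Locating the start of the range adds a small cost that grows at most logarithmically with total size, and when results can be read in time order, `.limit(100)` lets the scan stop after the first 100 matches.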
[X] Wrong: "Query time grows with total data size in the collection."
[OK] Correct: Time-series collections group measurements into time-ordered buckets and can use an index on the time field, so a range query reads only the relevant time slices, not all the data.
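One way to check this on a real deployment is to compare the work reported by `explain("executionStats")` as the collection grows. The sketch below is hedged: `dayQueryStats` is a hypothetical helper, and the exact shape of the explain output varies by server version. For time-series collections the statistics may appear under a `$cursor` pipeline stage, and the counts may refer to internal bucket documents rather than individual measurements.

```javascript
// Hypothetical helper: in mongosh, pass the global `db`.
function dayQueryStats(db, dayStart, dayEnd) {
  const plan = db.sensorData
    .find({ timestamp: { $gte: dayStart, $lt: dayEnd } })
    .sort({ timestamp: 1 })
    .limit(100)
    .explain("executionStats");

  // Assumption: the statistics sit either at the top level or, when the
  // plan is reported as pipeline stages, under the first stage's $cursor.
  const stats = plan.executionStats ?? plan.stages?.[0]?.$cursor?.executionStats;
  if (!stats) return null; // unrecognized explain shape

  return {
    returned: stats.nReturned,
    docsExamined: stats.totalDocsExamined,
    keysExamined: stats.totalKeysExamined,
  };
}

// Usage in mongosh (not run here):
// dayQueryStats(db, new Date("2024-01-01T00:00:00Z"), new Date("2024-01-02T00:00:00Z"));
```

If `docsExamined` stays roughly flat while the collection grows and the per-day volume is unchanged, that supports the claim that cost follows the requested range rather than the total size.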
Understanding how time-series collections handle queries helps you explain efficient data retrieval in real-world apps that track events over time.
What if we removed the time range filter? How would the time complexity change?