InfluxDB for time-series data in Raspberry Pi - Time & Space Complexity
When working with InfluxDB on a Raspberry Pi, it is important to understand how the time to store and query data grows as you add more time-series entries; in other words, how the number of operations changes as the amount of data increases.
Analyze the time complexity of the following code snippet.
```python
from influxdb_client import InfluxDBClient

# Connect to the local InfluxDB instance running on the Pi
client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
query_api = client.query_api()

# Flux query: temperature records from the last hour
query = 'from(bucket:"sensor_data") |> range(start: -1h) |> filter(fn: (r) => r._measurement == "temperature")'
result = query_api.query(org="my-org", query=query)

# Nested loop: every record in every returned table is printed once
for table in result:
    for record in table.records:
        print(record.get_value())
```
This code queries temperature data from the last hour and prints each value.
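The loop structure can be studied without a running InfluxDB instance. The sketch below simulates the nested iteration with hypothetical `FakeTable` and `FakeRecord` stand-ins for the client's result types, and counts how many times the inner loop body runs:

```python
class FakeRecord:
    """Hypothetical stand-in for an influxdb_client FluxRecord."""
    def __init__(self, value):
        self._value = value

    def get_value(self):
        return self._value

class FakeTable:
    """Hypothetical stand-in for an influxdb_client FluxTable."""
    def __init__(self, records):
        self.records = records

def count_operations(result):
    """Mirror the query-result loop: one operation per record."""
    ops = 0
    for table in result:
        for record in table.records:
            record.get_value()  # stands in for print(record.get_value())
            ops += 1
    return ops

# Two tables with 3 and 2 records: 5 records total, so 5 operations.
result = [FakeTable([FakeRecord(v) for v in (20.1, 20.3, 20.2)]),
          FakeTable([FakeRecord(v) for v in (19.8, 20.0)])]
print(count_operations(result))  # → 5
```

The operation count equals the total number of records across all tables, which is exactly the n in the analysis below.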
Identify the loops, recursion, or array traversals that determine how the work repeats.
- Primary operation: Looping through all records returned by the query.
- How many times: Once for each data point in the last hour matching the filter.
As the number of temperature records in the last hour grows, the number of print operations grows at the same rate.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 print operations |
| 100 | 100 print operations |
| 1000 | 1000 print operations |
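The values in the table can be reproduced with a short simulation: generate n stand-in data points and count the loop iterations.

```python
def operations_for(n):
    """Count loop iterations when processing n fake temperature records."""
    records = [20.0 + i * 0.01 for i in range(n)]  # stand-in data points
    ops = 0
    for _ in records:  # mirrors the per-record print loop
        ops += 1
    return ops

for n in (10, 100, 1000):
    print(n, operations_for(n))  # each n produces exactly n operations
```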
Pattern observation: The number of operations grows directly with the number of data points.
Time Complexity: O(n)
This means the time to process and print the data grows linearly with the number of records returned.
[X] Wrong: "Querying InfluxDB always takes the same time no matter how much data there is."
[OK] Correct: The query time depends on how many records match and are returned; more data means more processing time.
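Because the processing cost scales with the number of records returned, one practical lever on a resource-constrained Pi is to ask for fewer records. A minimal sketch, assuming a hypothetical helper `build_temperature_query` (not part of `influxdb_client`), shows how a narrower `range()` or a Flux `limit()` stage bounds n before the Python loop ever runs:

```python
def build_temperature_query(bucket="sensor_data", start="-1h", limit=None):
    """Build the same Flux query with a configurable range and optional limit."""
    query = (f'from(bucket:"{bucket}") '
             f'|> range(start: {start}) '
             '|> filter(fn: (r) => r._measurement == "temperature")')
    if limit is not None:
        query += f' |> limit(n: {limit})'  # Flux limit() caps the records returned
    return query

# Last 10 minutes only, at most 100 records back
print(build_temperature_query(start="-10m", limit=100))
```

Capping n this way bounds both the data transferred to the Pi and the O(n) cost of the loop that follows.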
Understanding how data size affects query time in databases like InfluxDB shows you can reason about performance in real projects, a valuable skill for any developer.
"What if we added an index on the _measurement field? How would the time complexity of the query change?"