Querying historical data in SCADA systems - Time & Space Complexity
Start learning this pattern below
Jump into concepts and practice - no test required
When we ask a SCADA system for past data, the time it takes depends on how much data we want. We want to understand how this time grows as we ask for more history.
The question is: How does the system's work increase when we query more historical records?
Analyze the time complexity of the following code snippet.
// Query historical data for a sensor
function queryHistoricalData(sensorId, startTime, endTime) {
let data = [];
let records = database.getRecords(sensorId, startTime, endTime);
for (let record of records) {
data.push(process(record));
}
return data;
}
This code fetches all records for a sensor between two times and processes each record one by one.
- Primary operation: Looping through each historical record returned from the database.
- How many times: Once for every record between startTime and endTime.
As the time range grows, more records are returned and processed, so the work grows with the number of records.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 records | About 10 processing steps |
| 100 records | About 100 processing steps |
| 1000 records | About 1000 processing steps |
Pattern observation: The work grows directly with the number of records requested.
Time Complexity: O(n)
This means the time to get and process data grows in a straight line with how many records we ask for.
[X] Wrong: "Querying more time means the system takes the same time because it just asks once."
[OK] Correct: Actually, the system must handle each record returned, so more records mean more work and more time.
Understanding how data queries scale helps you explain system behavior clearly. This skill shows you can think about real-world system limits and performance.
"What if the system cached recent records? How would that change the time complexity when querying recent data?"
Practice
Solution
Step 1: Understand the role of historical data
Historical data stores past readings and events from the system.Step 2: Identify the purpose of querying it
Querying helps analyze past behavior and detect trends or issues.Final Answer:
To review past system behavior and analyze trends -> Option BQuick Check:
Historical data = review past behavior [OK]
- Confusing historical data with real-time control
- Thinking it updates devices
- Assuming it changes network settings
Solution
Step 1: Check correct SQL syntax for conditions
Use single equals (=) for comparison and AND to combine conditions.Step 2: Verify logical conditions match requirements
SELECT * FROM readings WHERE sensor_id = 'S1' AND timestamp > '2024-01-01' AND type = 'temperature'; correctly filters sensor_id = 'S1', timestamp > '2024-01-01', and type = 'temperature'.Final Answer:
SELECT * FROM readings WHERE sensor_id = 'S1' AND timestamp > '2024-01-01' AND type = 'temperature'; -> Option DQuick Check:
Correct syntax and filters = SELECT * FROM readings WHERE sensor_id = 'S1' AND timestamp > '2024-01-01' AND type = 'temperature'; [OK]
- Using '==' instead of '=' in SQL
- Mixing AND and OR without parentheses
- Using wrong comparison operators
- Filtering with wrong timestamp direction
SELECT timestamp, value FROM readings WHERE sensor_id = 'S2' AND timestamp BETWEEN '2024-03-01' AND '2024-03-05' ORDER BY timestamp DESC LIMIT 3;What will be the output?
Solution
Step 1: Understand the WHERE and BETWEEN clause
Filters readings from sensor 'S2' between '2024-03-01' and '2024-03-05'.Step 2: Analyze ORDER BY and LIMIT
ORDER BY timestamp DESC sorts newest first; LIMIT 3 returns top 3 newest readings.Final Answer:
The 3 latest readings from sensor S2 between March 1 and 5, sorted descending -> Option AQuick Check:
ORDER BY DESC + LIMIT 3 = latest 3 readings [OK]
- Confusing ascending vs descending order
- Thinking LIMIT returns earliest records
- Assuming LIMIT causes syntax error with ORDER BY
SELECT * FROM readings WHERE sensor_id = 'P1' AND timestamp > '2024-02-01' AND type = 'pressure'But it returns no results, even though data exists. What is the likely problem?
Solution
Step 1: Check timestamp format correctness
Timestamp format must match stored data format exactly to filter correctly.Step 2: Verify other query parts
Sensor_id as string is valid; GROUP BY not needed; type column likely exists.Final Answer:
The timestamp format is incorrect and does not match stored data -> Option CQuick Check:
Timestamp format mismatch = no results [OK]
- Assuming sensor_id must be numeric
- Adding unnecessary GROUP BY
- Ignoring timestamp format differences
Solution
Step 1: Filter sensors correctly
Use (sensor_id = 'T1' OR sensor_id = 'T2') or sensor_id IN ('T1', 'T2') to include both sensors.Step 2: Apply date and value filters with grouping
Filter timestamp between January 1 and 31, value > 20, and type = 'temperature'. Group by sensor_id to get averages per sensor.Step 3: Check query correctness
SELECT sensor_id, AVG(value) FROM readings WHERE (sensor_id = 'T1' OR sensor_id = 'T2') AND timestamp BETWEEN '2024-01-01' AND '2024-01-31' AND value > 20 AND type = 'temperature' GROUP BY sensor_id; uses correct syntax with parentheses and GROUP BY; SELECT AVG(value) FROM readings WHERE sensor_id IN ('T1', 'T2') AND timestamp >= '2024-01-01' AND timestamp <= '2024-01-31' AND value > 20 AND type = 'temperature'; misses GROUP BY; SELECT sensor_id, AVG(value) FROM readings WHERE sensor_id = 'T1' AND sensor_id = 'T2' AND timestamp BETWEEN '2024-01-01' AND '2024-01-31' AND value > 20 AND type = 'temperature' GROUP BY sensor_id; has impossible condition; SELECT sensor_id, AVG(value) FROM readings WHERE sensor_id = 'T1' OR sensor_id = 'T2' AND timestamp BETWEEN '2024-01-01' AND '2024-01-31' AND value > 20 AND type = 'temperature'; lacks parentheses causing wrong logic.Final Answer:
SELECT sensor_id, AVG(value) FROM readings WHERE (sensor_id = 'T1' OR sensor_id = 'T2') AND timestamp BETWEEN '2024-01-01' AND '2024-01-31' AND value > 20 AND type = 'temperature' GROUP BY sensor_id; -> Option AQuick Check:
Correct filters + grouping = SELECT sensor_id, AVG(value) FROM readings WHERE (sensor_id = 'T1' OR sensor_id = 'T2') AND timestamp BETWEEN '2024-01-01' AND '2024-01-31' AND value > 20 AND type = 'temperature' GROUP BY sensor_id; [OK]
- Missing GROUP BY when aggregating by sensor
- Using AND instead of OR between sensor_ids
- Incorrect timestamp filtering logic
