Reading from SQL Databases in Python Data Analysis - Time & Space Complexity
When we read data from a SQL database, we want to know how the running time changes as the data grows.
The key question: how does reading more rows affect the time needed?
Analyze the time complexity of the following code snippet.
```python
import sqlite3
import pandas as pd

# Open a connection to the SQLite database file
conn = sqlite3.connect('example.db')

# Fetch every row from the sales table into a DataFrame
query = 'SELECT * FROM sales'
df = pd.read_sql_query(query, conn)

conn.close()
```
This code connects to a database, runs a query to get all rows from the 'sales' table, and loads the data into a DataFrame.
Identify the repeated work: loops, recursion, or traversals over the result set.
- Primary operation: Reading each row from the database result set.
- How many times: Once for every row in the 'sales' table.
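To make the "one read per row" claim concrete, here is a minimal sketch using an in-memory SQLite database as a stand-in for `example.db` (the table name `sales` comes from the snippet above; the columns and row count are illustrative assumptions):

```python
import sqlite3

# Hypothetical in-memory stand-in for example.db
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE sales (id INTEGER, amount REAL)')
conn.executemany('INSERT INTO sales VALUES (?, ?)',
                 [(i, i * 1.5) for i in range(100)])

# The driver yields one row per iteration: n rows -> n read operations
reads = 0
for row in conn.execute('SELECT * FROM sales'):
    reads += 1

print(reads)  # equals the number of rows in the table
conn.close()
```

Whether pandas builds the DataFrame for you or you iterate a cursor yourself, each row still has to be fetched once.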
As the number of rows n grows, the time to read them all grows proportionally.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 reads |
| 100 | 100 reads |
| 1000 | 1000 reads |
Pattern observation: The operation count grows in a straight line with the number of rows.
Time Complexity: O(n)
This means the time to read data grows directly with the number of rows you want to get.
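A small experiment can reproduce the table above. This sketch (again using an assumed in-memory database) counts read operations for several table sizes and shows the count matching n exactly:

```python
import sqlite3

# Count row reads for increasing table sizes (hypothetical data)
counts = {}
for n in (10, 100, 1000):
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE sales (id INTEGER)')
    conn.executemany('INSERT INTO sales VALUES (?)',
                     [(i,) for i in range(n)])
    # One read operation per row returned by the query
    counts[n] = sum(1 for _ in conn.execute('SELECT * FROM sales'))
    conn.close()

print(counts)  # each count matches its n, i.e. linear growth
```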
[X] Wrong: "Reading from a database is always instant, no matter how much data there is."
[OK] Correct: Reading more rows means more data to transfer and process, so it takes more time.
Understanding how data size affects reading time helps you explain performance in real projects and shows you think about efficiency.
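The space side matters too: `pd.read_sql_query` as written above holds all n rows in memory at once, i.e. O(n) space. Its `chunksize` parameter instead returns an iterator of smaller DataFrames, keeping peak memory near the chunk size while total time stays O(n). A sketch, assuming the same in-memory `sales` table:

```python
import sqlite3
import pandas as pd

# Hypothetical setup mirroring example.db
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE sales (id INTEGER, amount REAL)')
conn.executemany('INSERT INTO sales VALUES (?, ?)',
                 [(i, float(i)) for i in range(1000)])
conn.commit()

# chunksize makes read_sql_query yield DataFrames of at most 250 rows,
# so peak memory is O(chunk) instead of O(n); time remains O(n)
total_rows = 0
for chunk in pd.read_sql_query('SELECT * FROM sales', conn, chunksize=250):
    total_rows += len(chunk)

print(total_rows)  # all rows are still processed exactly once
conn.close()
```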
"What if we added a filter to the SQL query to only read a small subset of rows? How would the time complexity change?"
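As a starting point for that question: with a `WHERE` clause, only the k matching rows are transferred and processed, so the read cost becomes O(k). Finding those rows is still O(n) if the database must scan the whole table, but an index lets it skip non-matching rows. A sketch with assumed columns and data:

```python
import sqlite3

# Hypothetical table where 1 in 10 rows matches the filter
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE sales (id INTEGER, region TEXT)')
# An index lets the database locate matching rows without a full scan
conn.execute('CREATE INDEX idx_region ON sales(region)')
rows = [(i, 'north' if i % 10 == 0 else 'south') for i in range(1000)]
conn.executemany('INSERT INTO sales VALUES (?, ?)', rows)

# Only the matching subset is read and transferred
matches = list(conn.execute("SELECT * FROM sales WHERE region = 'north'"))
print(len(matches))
conn.close()
```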