Reading and writing CSV data in Python - Time & Space Complexity
When working with CSV files, it's important to know how the time to read or write data changes as the file grows. Specifically, we want to understand how the program's running time scales with the number of rows in the CSV.
Analyze the time complexity of the following code snippet.
```python
import csv

def read_csv(filename):
    # This function reads all rows from a CSV file into a list
    with open(filename, newline='') as file:
        reader = csv.reader(file)
        data = []
        for row in reader:
            data.append(row)
        return data
```
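A quick usage sketch may help make this concrete. The snippet below repeats `read_csv` so it runs on its own, and creates the input with `tempfile` so no particular file path is assumed:

```python
import csv
import tempfile

def read_csv(filename):
    # Same function as above, repeated so this snippet is self-contained.
    with open(filename, newline='') as file:
        reader = csv.reader(file)
        data = []
        for row in reader:
            data.append(row)
        return data

# Write a small CSV to a temporary file, then read it back.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    csv.writer(tmp).writerows([["name", "score"], ["ada", "10"], ["bob", "7"]])
    path = tmp.name

print(read_csv(path))  # → [['name', 'score'], ['ada', '10'], ['bob', '7']]
```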
This code reads each row from a CSV file and stores it in a list.
Identify the repeated operations: loops, recursion, or traversals over the data.
- Primary operation: Looping through each row in the CSV file.
- How many times: Once for every row in the file (n times).
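A quick way to confirm this is to count the loop iterations directly. The sketch below uses a small in-memory CSV (via `io.StringIO`, so no file on disk is needed) and shows that the iteration count equals the number of rows:

```python
import csv
import io

def count_row_reads(csv_text):
    """Count how many times the reader loop body runs."""
    reader = csv.reader(io.StringIO(csv_text))
    operations = 0
    for _ in reader:
        operations += 1  # one "row read" per row
    return operations

# Build a CSV with n rows and count the reads.
n = 100
csv_text = "\n".join(f"id{i},value{i}" for i in range(n))
print(count_row_reads(csv_text))  # → 100, one operation per row
```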
As the number of rows increases, the time to read grows linearly, tracing a straight line.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 row reads |
| 100 | About 100 row reads |
| 1000 | About 1000 row reads |
Pattern observation: Doubling the rows roughly doubles the work done.
Time Complexity: O(n)
This means the time to read the CSV grows directly with the number of rows.
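The same reasoning applies to writing, which the title also mentions: one `writerow` call per row means writing is O(n) as well. A minimal sketch, assuming rows are lists of strings and writing to an in-memory buffer so the example is self-contained:

```python
import csv
import io

def write_csv(rows):
    """Write rows one at a time; one writerow call per row -> O(n)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:  # runs once per row, just like reading
        writer.writerow(row)
    return buffer.getvalue()

rows = [["id", "value"], ["1", "a"], ["2", "b"]]
print(write_csv(rows))
```

With a real file you would pass an open file object to `csv.writer` instead of the buffer; the loop structure, and therefore the complexity, is unchanged.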
[X] Wrong: "Reading a CSV file always takes the same time no matter how big it is."
[OK] Correct: The program reads each row one by one, so more rows mean more time.
Understanding how file reading time grows helps you write efficient data processing code and explain your reasoning clearly.
"What if we read the CSV file but only processed every other row? How would the time complexity change?"
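One way to reason about this closing question: skipping every other row halves the work done per file, but the reader must still advance through every line to find the rows it keeps, so the complexity is still O(n); only the constant factor shrinks. A sketch of that variant:

```python
import csv
import io

def read_every_other_row(csv_text):
    """Keep only even-indexed rows; still O(n) because every row is scanned."""
    reader = csv.reader(io.StringIO(csv_text))
    data = []
    for i, row in enumerate(reader):
        if i % 2 == 0:  # process every other row
            data.append(row)
    return data

csv_text = "a,1\nb,2\nc,3\nd,4"
print(read_every_other_row(csv_text))  # → [['a', '1'], ['c', '3']]
```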