Dictionary-based CSV handling in Python - Time & Space Complexity
When working with CSV files using dictionaries, it's important to know how processing time grows as the file gets bigger. The goal here is to understand how the program's running time changes as it reads and handles the rows of a CSV file.
Analyze the time complexity of the following code snippet.
```python
import csv

def read_csv_to_dict(filename):
    with open(filename, mode='r', newline='') as file:
        reader = csv.DictReader(file)
        rows = []
        for row in reader:          # each row becomes a dict keyed by the header
            rows.append(row)
        return rows
```
This code reads a CSV file and stores each row as a dictionary in a list.
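To see what those row dictionaries look like, here is a minimal sketch using an in-memory CSV (the sample data is hypothetical, standing in for a file on disk):

```python
import csv
import io

# Hypothetical sample data in place of an actual CSV file.
sample = io.StringIO("name,age\nAda,36\nGrace,45\n")

reader = csv.DictReader(sample)
rows = list(reader)  # one dict per row, keyed by the header line

print(rows[0]["name"])  # Ada
print(rows[1]["age"])   # 45 (note: DictReader values are strings)
```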
Identify the operations that repeat: loops, recursion, or traversals over the data.
- Primary operation: Looping through each row in the CSV file.
- How many times: Once for every row in the file (n times).
As the number of rows in the CSV file increases, the time to read and store them grows linearly.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 loops to read rows |
| 100 | About 100 loops to read rows |
| 1000 | About 1000 loops to read rows |
Pattern observation: The work grows evenly with the number of rows; doubling rows doubles the work.
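The table above can be checked directly. This sketch builds in-memory CSVs of different sizes (assumed stand-ins for real files) and counts how many times the loop body runs:

```python
import csv
import io

def count_row_operations(n):
    # Build an in-memory CSV with n data rows.
    data = "a,b\n" + "".join(f"{i},{i}\n" for i in range(n))
    reader = csv.DictReader(io.StringIO(data))
    ops = 0
    for _ in reader:  # the loop body executes exactly once per row
        ops += 1
    return ops

for n in (10, 100, 1000):
    print(n, count_row_operations(n))  # operations grow in step with n
```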
Time Complexity: O(n)
This means the time to read the CSV grows directly with the number of rows in the file. The space complexity is also O(n), since every row dictionary is kept in the `rows` list.
[X] Wrong: "Reading a CSV with dictionaries is slower because dictionaries are complex."
[OK] Correct: The dominant cost is reading each row once; building each dictionary is constant work per row (for a fixed number of columns), so the total time grows linearly.
Understanding how reading data scales helps you write programs that handle files efficiently and shows you can think about performance clearly.
"What if we added a nested loop to compare each row with every other row? How would the time complexity change?"
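As a starting point for that question, here is a hedged sketch of such a nested loop, using a hypothetical duplicate-row check on a small in-memory CSV. The outer loop runs n times and the inner loop up to n-1 times, so the comparisons grow as O(n^2): doubling the rows roughly quadruples the work.

```python
import csv
import io

# Hypothetical sample with one duplicated row.
sample = io.StringIO("name,age\nAda,36\nGrace,45\nAda,36\n")
rows = list(csv.DictReader(sample))

duplicates = []
for i in range(len(rows)):             # n iterations
    for j in range(i + 1, len(rows)):  # up to n-1 iterations each -> O(n^2) total
        if rows[i] == rows[j]:
            duplicates.append((i, j))

print(duplicates)  # [(0, 2)]: row 0 and row 2 are identical
```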