Reading and writing CSV data in Python - Time & Space Complexity
When working with CSV files, it's important to know how the time to read or write data changes as the file grows.
We want to understand how the program's speed changes when the number of rows in the CSV changes.
Analyze the time complexity of the following code snippet.
import csv

def read_csv(filename):
    with open(filename, newline='') as file:
        reader = csv.reader(file)
        data = []
        for row in reader:
            data.append(row)
    return data
# This function reads all rows from a CSV file into a list
This code reads each row from a CSV file and stores it in a list.
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Looping through each row in the CSV file.
- How many times: Once for every row in the file (n times).
As the number of rows increases, the time to read grows in a straight line.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 row reads |
| 100 | About 100 row reads |
| 1000 | About 1000 row reads |
Pattern observation: Doubling the rows roughly doubles the work done.
Time Complexity: O(n)
This means the time to read the CSV grows in direct proportion to the number of rows, assuming each row has a roughly constant number of columns.
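The linear pattern in the table can be checked with a small counting sketch. It assumes an in-memory `io.StringIO` buffer in place of a real file, and `count_row_reads` is a hypothetical helper name used only for this illustration.

```python
import csv
import io

def count_row_reads(n):
    """Build an in-memory CSV with n rows and count loop iterations while reading it."""
    # io.StringIO stands in for a real file so the sketch needs no disk access.
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for i in range(n):
        writer.writerow([i, f"name{i}"])
    buffer.seek(0)

    reads = 0
    for _row in csv.reader(buffer):
        reads += 1  # one unit of work per row
    return reads

for n in (10, 100, 1000):
    print(n, count_row_reads(n))
```

The second number printed on each line equals the first, which matches the "about n row reads" column above.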
[X] Wrong: "Reading a CSV file always takes the same time no matter how big it is."
[OK] Correct: The program reads each row one by one, so more rows mean more time.
Understanding how file reading time grows helps you write efficient data processing code and explain your reasoning clearly.
"What if we read the CSV file but only processed every other row? How would the time complexity change?"
Practice
What does the csv.reader function do when reading a CSV file?
Solution
Step 1: Understand csv.reader purpose
csv.reader reads a CSV file and returns each row as a list of strings, one string per column.
Step 2: Differentiate from other functions
Functions like csv.DictReader return dictionaries, and writing functions save data rather than read it.
Final Answer:
It reads the file and returns each row as a list of strings. -> Option B
Quick Check:
csv.reader returns lists [OK]
- Confusing csv.reader with csv.DictReader
- Thinking csv.reader writes data
- Assuming it deletes or modifies files
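A minimal sketch of the difference between the two readers, assuming a small made-up CSV held in a string (the sample data is illustrative only):

```python
import csv
import io

text = "Name,Age\nAlice,30\n"

# csv.reader yields every row, header included, as a list of strings.
rows_as_lists = list(csv.reader(io.StringIO(text)))

# csv.DictReader uses the first row as keys and yields one mapping per data row.
rows_as_dicts = list(csv.DictReader(io.StringIO(text)))

print(rows_as_lists)
print(rows_as_dicts)
```

Neither reader modifies the underlying data; both only parse what they are given.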
Solution
Step 1: Identify mode for writing CSV
To write a CSV file, open it in write mode 'w' and pass newline='' to prevent extra blank lines on Windows.
Step 2: Check other options
'r' is read mode, 'a' is append mode (valid, but not what was asked), and 'rb' is binary read mode (not for writing text CSV).
Final Answer:
open('file.csv', 'w', newline='') -> Option A
Quick Check:
Write mode with newline='' [OK]
- Forgetting newline='' causes blank lines
- Using 'r' mode when writing
- Using binary mode for text CSV
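The effect of `newline=''` can be seen by inspecting the bytes that end up on disk. The sketch below assumes a temporary directory and a hypothetical file name `people.csv`. The file is reopened in binary mode only to look at the raw line endings, not to parse the CSV.

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "people.csv")  # hypothetical file name

    # newline='' hands line-ending control to the csv module.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Name", "Age"])
        writer.writerow(["Alice", 30])

    # Binary read, used here only to inspect the exact bytes written.
    with open(path, "rb") as f:
        raw = f.read()

print(raw)  # b'Name,Age\r\nAlice,30\r\n'
```

Each row ends with a single `\r\n`, the csv module's default line terminator. Without `newline=''`, Windows would translate the `\n` again, which is where the extra blank lines come from.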
import csv
with open('data.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['Name', 'Age'])
    writer.writerow(['Alice', 30])
with open('data.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
Solution
Step 1: Writing rows with csv.writer
The code writes two rows: the header ['Name', 'Age'] and the data row ['Alice', 30]. Numbers are converted to strings when written.
Step 2: Reading rows with csv.reader and printing
Reading returns each row as a list of strings. print(row) shows the list's repr, with quotes: ['Name', 'Age'] and ['Alice', '30'].
Final Answer:
['Name', 'Age']\n['Alice', '30'] -> Option C
Quick Check:
csv.writer writes lists, csv.reader reads lists [OK]
- Expecting printed rows as comma strings
- Confusing string and integer types in output
- Assuming encoding error without cause
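Because every value comes back as a string, numeric columns have to be converted explicitly after reading. A short sketch, assuming an in-memory buffer instead of `data.csv` and illustrative variable names:

```python
import csv
import io

buffer = io.StringIO()
csv.writer(buffer).writerow(["Alice", 30])  # 30 is written as the text "30"
buffer.seek(0)

name, age_text = next(csv.reader(buffer))
age = int(age_text)  # explicit conversion back to an integer

print(repr(age_text), repr(age))  # '30' 30
```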
import csv
with open('file.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
    print(row)
Solution
Step 1: Check code indentation
The print(row) call must be indented under the for statement to be part of the loop body.
Step 2: Verify other parts
The import is present and file mode 'r' is correct for reading. Omitting newline='' does not raise an error when reading, although the csv module documentation recommends passing it there too.
Final Answer:
Indentation error in the for loop -> Option A
Quick Check:
Python requires correct indentation [OK]
- Forgetting to indent inside loops
- Blaming the missing newline='' for the error
- Confusing file modes
Solution
Step 1: Choose reading method with headers
csv.DictReader reads each row as a dictionary keyed by the header row, which makes it easy to filter on a column name such as 'Age'.
Step 2: Filter and write with headers
Keep the rows where 'Age' is greater than 25 (convert the value with int() first, because DictReader returns strings), then write them with csv.DictWriter, passing fieldnames and calling writeheader() so the header is included.
Final Answer:
Use csv.DictReader to read rows as dictionaries, filter by 'Age' key, then write with csv.DictWriter including header. -> Option D
Quick Check:
DictReader + DictWriter for header and filtering [OK]
- Skipping header manually instead of using DictReader
- Writing without headers causing missing columns
- Using csv.writer without filtering logic
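The approach in the solution can be sketched as follows. The function name `filter_by_age`, the `min_age` parameter, and the sample data are assumptions made for illustration. In-memory buffers stand in for the input and output files.

```python
import csv
import io

def filter_by_age(src, dst, min_age=25):
    """Copy rows whose 'Age' exceeds min_age from src to dst, keeping the header."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        # Values are strings, so convert before comparing numerically.
        if int(row["Age"]) > min_age:
            writer.writerow(row)

source = io.StringIO("Name,Age\nAlice,30\nBob,22\n")
target = io.StringIO()
filter_by_age(source, target)
print(target.getvalue())
```

With real files, both would be opened with `newline=''`, the source in read mode and the destination in `'w'` mode. The loop still visits each row once, so the work stays O(n) in the number of rows.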
