Filling missing values (fillna) in Data Analysis Python - Time & Space Complexity
When we fill missing values in data, we want to know how long it takes as the data grows.
We ask: How does the time to fill missing spots change when the data gets bigger?
Analyze the time complexity of the following code snippet.
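import pandas as pd
# None is stored as NaN, so this Series has dtype float64
data = pd.Series([1, None, 3, None, 5])
# fillna returns a new Series; data itself is left unchanged
filled_data = data.fillna(0)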
This code replaces each missing value in a pandas Series with 0 and returns the result as a new Series.
Identify the loops, recursion, or array traversals that repeat.
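- Primary operation: Checking each element of the Series to see whether it is missing. pandas does this in vectorized compiled code rather than a Python-level loop, but every element is still visited.
- How many times: Once for every element in the Series.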
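The per-element check described above can be written out as an explicit loop. This is only an illustrative sketch, not how pandas implements `fillna` internally; `fill_missing` and its `checks` counter are hypothetical names introduced here to make the repeated work visible.

```python
import math

import pandas as pd


def fill_missing(values, fill_value):
    """Illustrative O(n) fill: visit each element exactly once (hypothetical helper)."""
    filled = []
    checks = 0
    for value in values:
        checks += 1  # one missing-value check per element
        if value is None or (isinstance(value, float) and math.isnan(value)):
            filled.append(fill_value)
        else:
            filled.append(value)
    return filled, checks


raw = [1, None, 3, None, 5]
manual, checks = fill_missing(raw, 0)
print(manual, checks)

# The hand-written loop agrees with pandas on the same input.
pandas_result = pd.Series(raw).fillna(0).tolist()
print(pandas_result == manual)
```

The counter always ends up equal to the length of the input, which is the "once for every element" behavior the analysis relies on.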
As the Series grows, the time to fill missing values grows linearly with its length.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 checks and fills |
| 100 | 100 checks and fills |
| 1000 | 1000 checks and fills |
Pattern observation: Doubling the data doubles the work needed.
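A rough way to check the doubling pattern is to time `fillna` on Series of increasing size. The sketch below is a hedged experiment rather than a benchmark: `time_fillna` is a hypothetical helper, the sizes are arbitrary, and the measured times depend on the machine and pandas version. Large sizes are used because for very small inputs (such as 10 or 100 elements) fixed call overhead tends to dominate the measurement.

```python
import time

import numpy as np
import pandas as pd


def time_fillna(n, repeats=5):
    """Best-of-`repeats` wall-clock time for fillna on n elements (hypothetical helper)."""
    values = np.arange(n, dtype="float64")
    values[::2] = np.nan  # make every other element missing
    series = pd.Series(values)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        series.fillna(0)
        best = min(best, time.perf_counter() - start)
    return best


for n in (1_000_000, 2_000_000, 4_000_000):
    print(f"n={n:>9,}  fillna took {time_fillna(n) * 1e3:.2f} ms")
```

If the O(n) analysis holds, each printed time should be roughly double the previous one, with some noise from caching and memory allocation.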
Time Complexity: O(n)
This means the time to fill missing values grows in direct proportion to the number of elements. Because fillna returns a new Series by default, the extra space needed is also O(n).
[X] Wrong: "Filling missing values happens instantly no matter how big the data is."
[OK] Correct: The code must check each item to find missing spots, so bigger data takes more time.
Understanding how filling missing data scales helps you explain data cleaning steps clearly and confidently.
"What if we filled missing values using the previous value instead of a fixed number? How would the time complexity change?"
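One way to explore this question is with `Series.ffill()`, which carries the last valid value forward. The sketch below compares it with a hand-written single-pass loop; `forward_fill` is a hypothetical helper, not the pandas implementation. Since the loop still visits each element once and only remembers one previous value, it suggests the cost remains O(n).

```python
import pandas as pd

data = pd.Series([1, None, 3, None, 5])
forward_filled = data.ffill()
print(forward_filled.tolist())  # [1.0, 1.0, 3.0, 3.0, 5.0]


def forward_fill(values):
    """Single-pass forward fill (hypothetical helper): remember the last valid value."""
    result = []
    last_valid = None
    for value in values:
        if value is not None:
            last_valid = value
        result.append(last_valid)
    return result


print(forward_fill([1, None, 3, None, 5]))  # [1, 1, 3, 3, 5]
```

A missing value at the very start has no previous value to copy, so it stays missing in both versions.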