Extracting year, month, day in Pandas - Time & Space Complexity
We want to know how the time needed to extract the year, month, and day from a Series of dates changes as the Series gets longer.
How does the work grow when the list of dates gets bigger?
Analyze the time complexity of the following code snippet.
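import pandas as pd
# Build a Series of 1000 consecutive daily timestamps starting 2020-01-01
dates = pd.Series(pd.date_range(start='2020-01-01', periods=1000))
# Each .dt accessor reads every element once and returns a new Series
years = dates.dt.year
months = dates.dt.month
days = dates.dt.day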
This code creates 1000 dates and extracts the year, month, and day from each date.
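Identify the loops, recursion, or array traversals that repeat. The code has no explicit Python loop, but each `.dt` accessor traverses the whole Series internally.
- Primary operation: Reading the year, month, or day field of each date in the Series.
- How many times: Once for each date (n times) per field. Three fields give about 3n element reads, which is still proportional to n.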
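To make the hidden traversal visible, here is a minimal sketch that extracts the year with an explicit Python loop and compares it with the vectorized accessor. The name `years_loop` is introduced only for illustration. The loop is a conceptual equivalent, not a description of how pandas implements `.dt.year` internally.

```python
import pandas as pd

dates = pd.Series(pd.date_range(start="2020-01-01", periods=1000))

# Explicit loop: one visit per date, so n iterations in total.
years_loop = [ts.year for ts in dates]

# Vectorized accessor: the same per-element read, without a Python-level loop.
years_vec = dates.dt.year

print(len(years_loop))                    # 1000
print(years_loop == years_vec.tolist())   # True
```

Both versions touch every date once. The vectorized form is usually much faster per element, but the number of elements visited is the same.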
As the number of dates grows, the work to extract each part grows in direct proportion.
| Input Size (n) | Approx. Operations (per field) |
|---|---|
| 10 | About 10 extractions |
| 100 | About 100 extractions |
| 1000 | About 1000 extractions |
Pattern observation: The work grows in direct proportion to the number of dates. Ten times as many dates means about ten times as many extractions.
Time Complexity: O(n)
This means the time to extract year, month, and day grows linearly as the number of dates increases.
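One way to check the linear trend empirically is to time the three extractions at a few input sizes. The following is a rough sketch, not a rigorous benchmark: the helper `time_extraction` and the chosen sizes are assumptions made for illustration, and a minute frequency is used (instead of the daily default) only so that a million timestamps stay within the range pandas can represent at nanosecond resolution. Actual timings depend on the machine and the pandas version, and small sizes are dominated by fixed overhead.

```python
import time
import pandas as pd

def time_extraction(n):
    """Return seconds spent extracting year, month, and day from n dates."""
    dates = pd.Series(pd.date_range(start="2020-01-01", periods=n, freq="min"))
    start = time.perf_counter()
    dates.dt.year
    dates.dt.month
    dates.dt.day
    return time.perf_counter() - start

# Hypothetical sizes, each ten times larger than the previous one.
timings = {n: time_extraction(n) for n in (10_000, 100_000, 1_000_000)}

for n, seconds in timings.items():
    print(f"n={n:>9,}: {seconds:.6f} s")
```

If the growth is linear, each tenfold increase in n should raise the measured time by roughly a factor of ten once n is large enough for the fixed overhead to be negligible.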
[X] Wrong: "Extracting year, month, and day is constant time no matter how many dates there are."
[OK] Correct: Each date must be read once per extracted field, so more dates mean proportionally more work. Vectorization lowers the cost per date but does not make the total constant.
Understanding how operations grow with data size helps you explain your code choices clearly and confidently.
"What if we extracted only the year instead of year, month, and day? How would the time complexity change?"