head() and tail() for previewing in Pandas - Time & Space Complexity
When we use head() or tail() in pandas, we want to quickly see a small part of our data.
We ask: How does the time to get this preview change as the data gets bigger?
Analyze the time complexity of the following code snippet.
import pandas as pd

# Build a two-column table with 1,000,000 rows
df = pd.DataFrame({
    'A': range(1000000),
    'B': range(1000000, 2000000)
})

# Select the first 5 and the last 5 rows
preview_top = df.head(5)
preview_bottom = df.tail(5)
This code creates a large table and then selects its first 5 and last 5 rows.
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Selecting a fixed number of rows from the start or end.
- How many times: Exactly 5 rows are accessed each time, no matter the total size.
Getting 5 rows from the start or end takes the same effort whether the table has 10 or 1,000,000 rows.
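To see why the table size does not matter, it helps to check that head(5) and tail(5) return the same rows as the positional slices df.iloc[:5] and df.iloc[-5:]. The sketch below is only an illustration; the names top_matches and bottom_matches are introduced here and are not part of the original snippet.

```python
import pandas as pd

df = pd.DataFrame({
    'A': range(1000000),
    'B': range(1000000, 2000000)
})

# Compare each preview with the equivalent positional slice
top_matches = df.head(5).equals(df.iloc[:5])
bottom_matches = df.tail(5).equals(df.iloc[-5:])

print(top_matches, bottom_matches)  # True True
```

Because a slice is defined by its start and end positions, pandas has no need to visit the rows outside that range.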
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 5 |
| 100 | 5 |
| 1000 | 5 |
Pattern observation: The number of operations stays the same, fixed by the number of rows requested.
Time Complexity: O(1)
This means the time to preview a fixed number of rows does not grow as the table gets more rows; it stays constant.
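One way to check this claim on your own machine is to time head(5) on tables of different sizes. The following is a rough sketch, not a rigorous benchmark: the helper time_head and the chosen sizes are assumptions made for illustration, and the exact numbers will vary between machines and pandas versions.

```python
import timeit
import pandas as pd

def time_head(n_rows, repeats=1000):
    """Return the average seconds per df.head(5) call on a table with n_rows rows."""
    df = pd.DataFrame({'A': range(n_rows), 'B': range(n_rows, 2 * n_rows)})
    total = timeit.timeit(lambda: df.head(5), number=repeats)
    return total / repeats

# Hypothetical table sizes chosen for illustration
timings = {n: time_head(n) for n in (10, 1_000, 100_000)}

for n, seconds in timings.items():
    print(f"{n:>9,} rows: {seconds * 1e6:.1f} microseconds per head(5)")
```

If the constant-time analysis holds, the per-call timings should stay in the same range across all three sizes instead of growing with the row count.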
[X] Wrong: "Getting the first or last rows takes longer if the table is huge."
[OK] Correct: Because head(5) and tail(5) are positional slices (equivalent to df.iloc[:5] and df.iloc[-5:]), pandas selects only the requested rows and does not scan the whole table.
Understanding that previewing data is quick and does not depend on total size helps you explain efficient data handling in real projects.
What if we changed head(5) to head(n) where n grows with the data size? How would the time complexity change?
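As a starting point for exploring this question, the sketch below requests half of the table each time, so n grows together with the data. The choice of n = total_rows // 2 and the name row_counts are assumptions for illustration only. The call to copy() is added so that every returned row is actually touched.

```python
import pandas as pd

row_counts = {}
for total_rows in (10, 1_000, 100_000):
    df = pd.DataFrame({'A': range(total_rows)})
    n = total_rows // 2          # hypothetical choice: request half of the table
    preview = df.head(n).copy()  # copy() touches every returned row
    row_counts[total_rows] = len(preview)

print(row_counts)  # {10: 5, 1000: 500, 100000: 50000}
```

The number of rows returned now grows in step with the table, so any work done per returned row, such as copying or printing, grows with it as well.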