Ascending and descending order in Pandas - Time & Space Complexity
When we sort data in pandas, it takes some time depending on how much data there is.
We want to know how the time needed changes as the data grows.
Analyze the time complexity of the following code snippet.
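```python
import pandas as pd

# Small example DataFrame with a single numeric column
df = pd.DataFrame({
    'numbers': [5, 2, 9, 1, 7]
})

# Return a new DataFrame with rows ordered by 'numbers', smallest first
sorted_df = df.sort_values(by='numbers', ascending=True)
```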
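This code sorts the rows of the DataFrame by the numbers column in ascending order (smallest first). Passing ascending=False sorts in descending order instead; the time complexity is the same in both directions.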
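For the descending case named in the title, here is a minimal sketch that reuses the same example data. Only the ascending flag changes; the variable name desc_df is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'numbers': [5, 2, 9, 1, 7]})

# Same sort_values call, but ascending=False puts the largest value first.
desc_df = df.sort_values(by='numbers', ascending=False)

# sort_values returns a new DataFrame; the original row order is unchanged.
print(desc_df['numbers'].tolist())  # [9, 7, 5, 2, 1]
print(df['numbers'].tolist())       # [5, 2, 9, 1, 7]
```

Because the direction only changes which way the comparisons are interpreted, the amount of sorting work grows in the same way as in the ascending case.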
Identify the loops, recursion, and array traversals that repeat.
- Primary operation: The sorting algorithm compares and rearranges elements.
- How many times: Each row takes part in several comparisons, and the number of comparisons per row increases slowly as the number of rows grows.
As the number of rows grows, the sorting work grows faster than the number of rows itself.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 30 to 40 comparisons |
| 100 | About 700 to 1,000 comparisons |
| 1,000 | About 10,000 to 15,000 comparisons |
Pattern observation: The work grows faster than the number of rows, roughly like n log n.
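The growth pattern can be checked empirically. The sketch below is an illustration under a stated assumption: it counts comparisons made by Python's built-in sorted function, not by the sorting routine pandas uses internally, so the exact counts will differ from the table. The helper count_comparisons is a hypothetical name introduced for this example.

```python
import random
from functools import cmp_to_key

def count_comparisons(n, seed=0):
    """Sort n random integers and return how many comparisons were made."""
    rng = random.Random(seed)
    data = [rng.randrange(1_000_000) for _ in range(n)]
    counter = {'calls': 0}

    def compare(a, b):
        # Called once for every pair of elements the sort compares.
        counter['calls'] += 1
        return (a > b) - (a < b)

    # Assumption: Python's built-in sort stands in for pandas' internal sort.
    sorted(data, key=cmp_to_key(compare))
    return counter['calls']

counts = {n: count_comparisons(n) for n in (10, 100, 1000)}
for n, c in counts.items():
    print(f"n={n}: {c} comparisons")
```

Each tenfold increase in n should raise the comparison count by more than a factor of ten, but by far less than the factor of one hundred that quadratic growth would produce.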
Time Complexity: O(n log n)
This means that if you double the data size, the time needed slightly more than doubles, but it stays well below the fourfold increase you would see with quadratic (n squared) growth.
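The effect of doubling can be worked out directly from the formula. This sketch uses n * log2(n) as an idealized operation count; real running time also depends on constant factors and hardware, which are ignored here. The helper n_log_n is a name introduced for this example.

```python
import math

def n_log_n(n):
    """Idealized operation count for an O(n log n) sort."""
    return n * math.log2(n)

ratios = {}
for n in (1_000, 1_000_000):
    # How much more work is there when the input doubles from n to 2n?
    ratios[n] = n_log_n(2 * n) / n_log_n(n)
    print(f"n={n}: doubling multiplies the work by about {ratios[n]:.2f}")

# n=1000: doubling multiplies the work by about 2.20
# n=1000000: doubling multiplies the work by about 2.10
```

The ratio is always above 2 and moves closer to 2 as n gets larger, which is why O(n log n) sorting scales well to large DataFrames.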
[X] Wrong: "Sorting takes the same time no matter how many rows there are."
[OK] Correct: Sorting compares many pairs of rows, so more rows mean more comparisons and more time.
Understanding how sorting time grows helps you explain your code choices clearly and shows you know how data size affects performance.
"What if we sorted by multiple columns instead of one? How would the time complexity change?"