melt() for unpivoting in Pandas - Time & Space Complexity
When we use pandas melt() to reshape data, it helps to know how the time needed grows as the table gets bigger.
Specifically, we want to see how the work done by melt() changes as the table gains more rows or more columns to unpivot.
Analyze the time complexity of the following code snippet.
import pandas as pd
df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Math': [90, 80, 70],
    'Science': [85, 75, 65],
    'English': [88, 78, 68]
})
melted = pd.melt(df, id_vars=['ID'], var_name='Subject', value_name='Score')
This code takes a table with scores in columns and turns it into a longer table with one score per row.
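To make the reshaping concrete, the short sketch below reruns the same snippet and inspects the result. It uses only the names from the example above; the printed shape follows from 3 rows times 3 subject columns.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Math': [90, 80, 70],
    'Science': [85, 75, 65],
    'English': [88, 78, 68],
})

melted = pd.melt(df, id_vars=['ID'], var_name='Subject', value_name='Score')

# 3 rows x 3 unpivoted columns -> 9 output rows; columns are ID, Subject, Score
print(melted.shape)  # (9, 3)
print(melted.head(4))
```

Each original score ends up on its own row, which is why the output length is the number of rows multiplied by the number of unpivoted columns.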
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: pandas copies the values of every unpivoted column into one long column and repeats the ID values alongside them.
- How many times: It visits every cell in the columns being unpivoted once.
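The traversal described above can be sketched by hand with NumPy. This is an illustration of the idea, not pandas' actual implementation; `manual_melt` is a hypothetical helper written for this example and assumes a single ID column.

```python
import numpy as np
import pandas as pd

def manual_melt(df, id_col, var_name, value_name):
    """Hypothetical helper: unpivot every non-ID column by hand."""
    value_cols = [c for c in df.columns if c != id_col]
    n, m = len(df), len(value_cols)
    return pd.DataFrame({
        # ID values repeated once per unpivoted column: n * m entries
        id_col: np.tile(df[id_col].to_numpy(), m),
        # each column name repeated once per row: n * m entries
        var_name: np.repeat(value_cols, n),
        # cell values read column by column: n * m entries
        value_name: df[value_cols].to_numpy().ravel(order='F'),
    })

df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Math': [90, 80, 70],
    'Science': [85, 75, 65],
    'English': [88, 78, 68],
})

by_hand = manual_melt(df, 'ID', 'Subject', 'Score')
built_in = pd.melt(df, id_vars=['ID'], var_name='Subject', value_name='Score')

# Compare column by column as plain lists to sidestep dtype differences
same = all(by_hand[c].tolist() == built_in[c].tolist() for c in built_in.columns)
print(same)
```

Every array built inside the helper has n * m entries, which matches the claim that each unpivoted cell is visited once.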
As the number of rows or columns grows, the work grows too because melt() looks at each cell in the selected columns.
| Input Size (rows x unpivoted columns) | Approx. Operations (cells visited) |
|---|---|
| 10 x 3 | About 30 |
| 100 x 3 | About 300 |
| 1000 x 3 | About 3000 |
Pattern observation: The operations grow roughly in direct proportion to the number of cells melted.
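The pattern in the table can be checked empirically. The sketch below is one rough way to do it; `time_melt` is a hypothetical helper, the random data is made up for the test, and the measured times will vary by machine and pandas version, so only the trend matters.

```python
import time

import numpy as np
import pandas as pd

def time_melt(n_rows, n_cols):
    """Hypothetical helper: build an n_rows x n_cols table and time one melt() call."""
    data = np.random.randint(0, 100, size=(n_rows, n_cols))
    df = pd.DataFrame(data, columns=[f'c{i}' for i in range(n_cols)])
    df.insert(0, 'ID', np.arange(n_rows))

    start = time.perf_counter()
    melted = pd.melt(df, id_vars=['ID'])
    elapsed = time.perf_counter() - start
    return len(melted), elapsed

# Keep 3 unpivoted columns fixed and grow the row count
for n_rows in (1_000, 10_000, 100_000):
    rows_out, elapsed = time_melt(n_rows, 3)
    print(f'{n_rows:>7} x 3 -> {rows_out:>7} output rows, {elapsed:.5f} s')
```

For small tables, fixed overhead tends to dominate, so the proportional growth usually shows up more clearly at the larger sizes.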
Time Complexity: O(n x m)
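This means the time grows in proportion to the product of the number of rows (n) and the number of columns (m) being unpivoted: doubling either one roughly doubles the work. The result also has n x m rows, so the memory it needs grows at the same rate.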
[X] Wrong: "melt() only depends on the number of rows, so columns don't affect time."
[OK] Correct: melt() processes every cell in the columns being unpivoted, so more columns mean more work.
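The effect of columns can be seen by holding the row count fixed and unpivoting more columns. The sketch below does that with made-up zero-filled tables; the column names `c0`, `c1`, ... are placeholders, and the memory figures it prints depend on the pandas version and dtypes in use.

```python
import numpy as np
import pandas as pd

n_rows = 1_000
sizes = {}

for n_cols in (1, 5, 25):
    # Placeholder table: n_rows x n_cols zeros plus an ID column
    wide = pd.DataFrame(
        np.zeros((n_rows, n_cols), dtype='int64'),
        columns=[f'c{i}' for i in range(n_cols)],
    )
    wide.insert(0, 'ID', np.arange(n_rows))

    long_df = pd.melt(wide, id_vars=['ID'])
    sizes[n_cols] = len(long_df)

    # Output rows, and the bytes used by the melted result
    print(n_cols, len(long_df), long_df.memory_usage(deep=True).sum())
```

With the rows held at 1,000, the output length is 1,000 times the number of unpivoted columns, so both the work and the size of the result grow with the column count.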
Understanding how data reshaping scales helps you explain your choices clearly and shows you know what happens behind the scenes.
"What if we melt only one column instead of many? How would the time complexity change?"
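One way to explore this question is to pass `value_vars` so that only a single column is unpivoted. The sketch below reuses the example table and picks `'Math'` arbitrarily.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Math': [90, 80, 70],
    'Science': [85, 75, 65],
    'English': [88, 78, 68],
})

# Unpivot only the 'Math' column; 'Science' and 'English' are left out of the result
one_col = pd.melt(
    df,
    id_vars=['ID'],
    value_vars=['Math'],
    var_name='Subject',
    value_name='Score',
)

print(one_col.shape)  # (3, 3): one output row per input row
```

With m fixed at 1, the n x m term reduces to n, so the work depends only on the number of rows.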