pivot_wider (long to wide) in R Programming - Time & Space Complexity
pivot_wider in R reshapes data from a long format to a wide format. Here we analyze how the running time of that reshaping grows as the data gets bigger.
Analyze the time complexity of the following code snippet.
```r
library(tidyr)

# Long-format data: 1000 ids, each with three key/value rows (3000 rows total)
data_long <- data.frame(
  id = rep(1:1000, each = 3),
  key = rep(c('A', 'B', 'C'), times = 1000),
  value = rnorm(3000)
)

# Spread the three keys into columns A, B, C, giving one row per id
data_wide <- pivot_wider(data_long, names_from = key, values_from = value)
```
This code takes a long table with repeated ids and keys, and makes it wide by spreading keys into columns.
Identify the operations that repeat (loops, recursion, array traversals).
- Primary operation: scanning all rows to group by `id` and spread `key` values into columns.
- How many times: each of the n rows is processed once to place its value in the wide format.
As the number of rows grows, the work grows roughly in direct proportion.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 operations |
| 100 | About 100 operations |
| 1000 | About 1000 operations |
Pattern observation: Doubling the input roughly doubles the work needed.
Time Complexity: O(n)
This means the time to pivot grows linearly with the number of rows in the data.
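A quick way to check this linear pattern empirically is to time pivot_wider at a few input sizes. The sketch below is illustrative: exact timings depend on your machine and tidyr version, but doubling the row count should roughly double the elapsed time.

```r
library(tidyr)

# Time pivot_wider for increasing numbers of ids (3 rows per id).
# The absolute numbers are machine-dependent; look for the linear trend.
for (n_ids in c(10000, 20000, 40000)) {
  d <- data.frame(
    id    = rep(seq_len(n_ids), each = 3),
    key   = rep(c('A', 'B', 'C'), times = n_ids),
    value = rnorm(3 * n_ids)
  )
  elapsed <- system.time(
    pivot_wider(d, names_from = key, values_from = value)
  )["elapsed"]
  cat(sprintf("n = %d rows: %.3f s\n", 3 * n_ids, elapsed))
}
```

If the timings grow much faster than linearly on your data, that usually signals something beyond a plain pivot, such as list-columns being created for duplicate id/key pairs.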
[X] Wrong: "Pivoting data is always slow because it rearranges everything multiple times."
[OK] Correct: pivot_wider processes each row once, so its runtime grows linearly with the data, not explosively.
Understanding how data reshaping scales helps you work efficiently with real datasets and shows you can reason about performance.
"What if the number of unique keys grows with the input size? How would the time complexity change?"