Apply family vs loops in R Programming - Performance Comparison
We want to understand how the apply family of functions compares with explicit loops in R in terms of running time.
How does the number of steps grow as the input data gets bigger?
Analyze the time complexity of the following code snippet.
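# Assumes `mat` is an existing numeric matrix
# Using a loop to sum each row of a matrix
result_loop <- numeric(nrow(mat))  # preallocate the result vector
for (i in seq_len(nrow(mat))) {    # seq_len() is safe even when mat has zero rows
  result_loop[i] <- sum(mat[i, ])
}
# Using apply to sum each row of a matrix (MARGIN = 1 selects rows)
result_apply <- apply(mat, 1, sum)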
This code sums each row of a matrix using a loop and then using the apply function.
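To make the snippet runnable end to end, here is a minimal sketch that defines a small matrix and checks that both approaches agree. The 4 x 3 example data is an assumption made for illustration; it is not part of the original snippet.

```r
# Hypothetical example data: a 4 x 3 numeric matrix, filled column by column
mat <- matrix(as.numeric(1:12), nrow = 4, ncol = 3)

# Loop version: preallocate, then fill in one row sum at a time
result_loop <- numeric(nrow(mat))
for (i in seq_len(nrow(mat))) {
  result_loop[i] <- sum(mat[i, ])
}

# apply version: MARGIN = 1 applies the function to each row
result_apply <- apply(mat, 1, sum)

print(result_loop)   # 15 18 21 24
print(result_apply)  # 15 18 21 24
print(isTRUE(all.equal(result_loop, result_apply)))
```

Both versions visit every element of `mat` exactly once, which is why their results match.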
Identify the loops, recursion, and array traversals that repeat.
- Primary operation: Summing the elements of one row of the matrix.
- How many times: Once for each row, so as many times as the number of rows (n).
As the number of rows increases (with the number of columns held fixed), the total work grows proportionally because each row is processed once.
| Input Size (n rows) | Approx. Operations |
|---|---|
| 10 | 10 sums of row elements |
| 100 | 100 sums of row elements |
| 1000 | 1000 sums of row elements |
Pattern observation: For a fixed number of columns, the number of operations grows linearly with the number of rows.
Time Complexity: O(n * m)
Here n is the number of rows and m is the number of columns. The time grows roughly with n times m, since each element is visited once; when m is held fixed, this is linear growth in n.
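The rows-times-columns count can be checked directly by tallying how many elements are visited. The sketch below is only an illustration: `count_visits` is a hypothetical helper written for this example, and the choice of 5 columns is arbitrary.

```r
# count_visits is a hypothetical helper: it walks the rows the same way the
# loop version does and tallies how many elements each row sum would touch.
count_visits <- function(n, m) {
  mat <- matrix(0, nrow = n, ncol = m)
  visits <- 0
  for (i in seq_len(n)) {
    visits <- visits + length(mat[i, ])  # one row sum touches m elements
  }
  visits
}

m <- 5  # number of columns, held fixed (arbitrary choice)
for (n in c(10, 100, 1000)) {
  cat(n, "rows x", m, "columns ->", count_visits(n, m), "element visits\n")
}
```

With the column count fixed, multiplying the number of rows by ten multiplies the number of visits by ten.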
[X] Wrong: "apply is always faster than loops because it is a special function."
[OK] Correct: apply iterates over the rows with an R-level loop internally, so it does essentially the same work as an explicit for loop; speed depends on implementation and data size, not just the function name.
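A rough way to test this claim on your own machine is to time the approaches with `system.time`. This is a sketch under stated assumptions: the matrix size is arbitrary, `row_sums_loop` is a hypothetical wrapper around the loop, and `rowSums` is included only as an extra point of comparison that the original text does not discuss. Timings vary by machine and R version, so no expected numbers are shown.

```r
set.seed(1)  # reproducible example data
mat <- matrix(runif(2000 * 50), nrow = 2000, ncol = 50)

# Hypothetical wrapper around the explicit loop, so it can be timed as one call
row_sums_loop <- function(x) {
  out <- numeric(nrow(x))
  for (i in seq_len(nrow(x))) {
    out[i] <- sum(x[i, ])
  }
  out
}

time_loop  <- system.time(res_loop  <- row_sums_loop(mat))
time_apply <- system.time(res_apply <- apply(mat, 1, sum))
time_vec   <- system.time(res_vec   <- rowSums(mat))  # built-in row sums

# Elapsed wall-clock seconds for each approach
timings <- c(
  loop    = time_loop[["elapsed"]],
  apply   = time_apply[["elapsed"]],
  rowSums = time_vec[["elapsed"]]
)
print(timings)
```

A single `system.time` call is noisy for short runs; repeating the measurement several times gives a more trustworthy comparison.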
Understanding how loops and apply functions scale helps you write clear and efficient R code, a useful skill in many data tasks.
What if we used lapply on a list of vectors instead of apply on a matrix? How would the time complexity change?
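One way to start exploring this question is to set up a small list and count how many elements are touched in total. The list `vec_list` below is hypothetical example data, and the sketch is a starting point for experimenting rather than a full answer.

```r
# Hypothetical list of numeric vectors with different lengths
vec_list <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

# Loop version: one sum per list element
sums_loop <- numeric(length(vec_list))
for (i in seq_along(vec_list)) {
  sums_loop[i] <- sum(vec_list[[i]])
}

# lapply returns a list; unlist() flattens it into a plain vector
sums_lapply <- unlist(lapply(vec_list, sum), use.names = FALSE)

# Total number of elements visited across all vectors
total_elements <- sum(lengths(vec_list))

print(sums_loop)       # 6 30 5
print(sums_lapply)     # 6 30 5
print(total_elements)  # 6
```

Try changing the lengths of the vectors and watch how `total_elements` relates to the amount of work done.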