NULL and NA values in R Programming - Time & Space Complexity
We want to understand how checking for NULL and NA values in R grows as the data size increases.
How does the time to find these special values change when we have more data?
Analyze the time complexity of the following code snippet.
```r
# Check for NA values in a vector
check_na <- function(vec) {
  result <- logical(length(vec))
  for (i in seq_along(vec)) {
    result[i] <- is.na(vec[i])
  }
  return(result)
}
```
This code goes through each element of a vector and checks if it is NA, storing TRUE or FALSE in a result vector.
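As a quick illustration, here is a small usage sketch (the sample vector is made up for demonstration; the function definition is repeated so the snippet runs on its own):

```r
# Same function as above, repeated so this snippet is self-contained
check_na <- function(vec) {
  result <- logical(length(vec))
  for (i in seq_along(vec)) {
    result[i] <- is.na(vec[i])
  }
  return(result)
}

vec <- c(1, NA, 3, NA, 5)
print(check_na(vec))  # FALSE TRUE FALSE TRUE FALSE
```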
Identify the repeated operations: loops, recursion, or array traversals.
- Primary operation: the for-loop that checks each element with `is.na()`.
- How many times: once for every element of the input vector.

As the vector grows, the number of checks grows in direct proportion.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 checks |
| 100 | 100 checks |
| 1000 | 1000 checks |
Pattern observation: The number of operations grows directly with the size of the input.
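The table above can be reproduced with a small counting sketch (the helper `count_checks` is hypothetical, written only to make the operation count explicit):

```r
# Count how many is.na() checks the loop performs for a vector of length n
count_checks <- function(n) {
  vec <- rep(NA, n)
  checks <- 0
  for (i in seq_along(vec)) {
    checks <- checks + 1   # one check per element
    is.na(vec[i])
  }
  checks
}

for (n in c(10, 100, 1000)) {
  cat(n, "elements ->", count_checks(n), "checks\n")
}
```

The count equals the input size exactly, which is the pattern the table summarizes.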
Time Complexity: O(n)
This means the time to check for NA values grows linearly with the input size: doubling the data roughly doubles the work.
[X] Wrong: "Checking for NA values is instant no matter how big the data is."
[OK] Correct: Each element must be checked one by one, so more data means more work.
Understanding how simple checks scale helps you explain performance in real data tasks clearly and confidently.
"What if we used a built-in vectorized function instead of a loop? How would the time complexity change?"
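One way to explore that question: R's built-in `is.na()` already accepts a whole vector, so no explicit loop is needed. The asymptotic complexity is still O(n), since every element must still be examined, but the iteration happens in compiled code rather than interpreted R, so the constant factor is much smaller. A minimal sketch:

```r
# Vectorized check: is.na() is applied to the entire vector at once.
# Still O(n) overall, but the per-element work runs in compiled code.
vec <- c(1, NA, 3, NA, 5)
print(is.na(vec))  # FALSE TRUE FALSE TRUE FALSE
```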