Factor creation in R Programming - Time & Space Complexity
We want to see how the time needed to create a factor changes as the input grows.
How does the time to build a factor from a vector change as the vector gets longer?
Analyze the time complexity of the following code snippet.
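# Input size (length of the vector)
n <- 1000
# Build a character vector of length n that cycles through three fruit names
vec <- rep(c("apple", "banana", "cherry"), length.out = n)
# Create a factor: the levels are the sorted distinct values of vec
fact <- factor(vec)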
This code builds a factor from a character vector of length n that cycles through three fruit names.
Identify the loops, recursion, or array traversals that repeat.
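- Primary operation: Traversing the vector to find its distinct values (which become the sorted levels) and to assign each element the integer code of its level.
- How many times: A fixed number of passes over the vector, so the work per pass is once for each element (n times).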
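The two bullets above can be made concrete with a rough sketch of what `factor()` does for a character vector. This is a simplified approximation rather than R's actual implementation: it assumes default arguments and no missing values, and the names `lv` and `codes` are illustrative only.

```r
n <- 12
vec <- rep(c("apple", "banana", "cherry"), length.out = n)

# Step 1: collect the distinct values and sort them to obtain the levels
lv <- sort(unique(vec))

# Step 2: map every element to the position of its level
codes <- match(vec, lv)

# Compare the manual steps with the built-in result
fact <- factor(vec)
identical(levels(fact), lv)
identical(as.integer(fact), codes)
```

Both steps touch every element of `vec`, which is where the dependence on n comes from.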
As the vector gets longer, the time to create the factor grows roughly in direct proportion to its length.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 element visits |
| 100 | About 100 element visits |
| 1000 | About 1000 element visits |
Pattern observation: Doubling the input roughly doubles the work needed.
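The doubling pattern can be checked empirically with a small timing experiment. The sketch below is one possible way to do it: `time_factor` is a hypothetical helper, the sizes are arbitrary, and single `system.time()` measurements are noisy, so the ratios will only be approximately 2 on any given run.

```r
# Hypothetical helper: elapsed seconds to build a factor from n elements
time_factor <- function(n) {
  vec <- rep(c("apple", "banana", "cherry"), length.out = n)
  system.time(factor(vec))[["elapsed"]]
}

# Each size is double the previous one
sizes <- c(1e5, 2e5, 4e5, 8e5)
timings <- data.frame(
  n = sizes,
  seconds = vapply(sizes, time_factor, numeric(1))
)
print(timings)
```

Repeating each measurement several times and taking the median would give steadier numbers than a single run.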
Time Complexity: O(n)
This means the time to create a factor grows linearly with the length of the input vector, as long as the number of distinct values stays small, as it does here.
[X] Wrong: "Creating a factor is instant no matter how big the vector is."
[OK] Correct: The function must look at each element to assign levels, so bigger vectors take more time.
Understanding how factor creation scales helps you reason about data processing speed in R, a useful skill in many coding tasks.
"What if we created a factor from a vector with many unique values instead of just three? How would the time complexity change?"
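One way to explore this question is to compare a vector with three distinct values against one in which every value is distinct. The sketch below is only a starting point: the names `few` and `many` and the size are assumptions chosen for illustration, and the timings will vary by machine. Keep in mind that `factor()` sorts the distinct values to build the levels, so the number of levels becomes part of the cost.

```r
n <- 1e5

# Three distinct values, repeated
few <- rep(c("apple", "banana", "cherry"), length.out = n)

# Every value distinct
many <- paste0("item", seq_len(n))

f_few <- factor(few)
f_many <- factor(many)

# Number of levels in each factor
nlevels(f_few)
nlevels(f_many)

# Compare the time taken for each case
system.time(factor(few))
system.time(factor(many))
```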