0
0
R Programmingprogramming~5 mins

sub and gsub in R Programming - Time & Space Complexity

Choose your learning style9 modes available
Time Complexity: sub and gsub
O(n)
Understanding Time Complexity

We want to understand how the time it takes to replace text grows when using sub and gsub in R.

How does the work change as the text gets longer or has more matches?

Scenario Under Consideration

Analyze the time complexity of the following code snippet.

text <- c("apple apple", "banana apple", "apple banana apple")
result_sub <- sub("apple", "orange", text)
result_gsub <- gsub("apple", "orange", text)

This code replaces the first occurrence of "apple" with "orange" using sub, and all occurrences using gsub, in a vector of strings.

Identify Repeating Operations

Identify the loops, recursion, array traversals that repeat.

  • Primary operation: Scanning each string to find matches of the pattern.
  • How many times: For sub, scanning stops after the first match per string; for gsub, scanning continues through the entire string to find all matches.
How Execution Grows With Input

As the length of each string or the number of strings grows, the work increases.

Input Size (n)Approx. Operations
10 (short strings)About 10 scans for sub, 10 full scans for gsub
100 (longer strings)About 100 scans for sub, 100 full scans for gsub
1000 (many long strings)About 1000 scans for sub, 1000 full scans for gsub

Pattern observation: The time grows roughly in direct proportion to the total length of all strings. gsub does more work per string because it finds all matches, not just the first.

Final Time Complexity

Time Complexity: O(n)

This means the time to replace text grows linearly with the total size of the input strings.

Common Mistake

[X] Wrong: "sub and gsub always take the same time because they do similar replacements."

[OK] Correct: sub stops after the first match in each string, so it often does less work than gsub, which finds and replaces all matches.

Interview Connect

Understanding how string replacement scales helps you write efficient code when working with text data, a common task in many projects.

Self-Check

"What if we changed the pattern to a more complex regular expression? How would the time complexity change?"