How to Replace Text in a String in R: Simple Guide
In R, you can replace text in a string using gsub() for global replacement or sub() for the first match only. Both functions take the pattern to find, the replacement text, and the original string as arguments.
Syntax
The basic syntax for replacing text in a string in R uses the gsub() or sub() function:
gsub(pattern, replacement, x): Replaces all occurrences of pattern in x with replacement.
sub(pattern, replacement, x): Replaces only the first occurrence of pattern in x with replacement.
Here, pattern is the text or regular expression to find, replacement is the new text, and x is the original string (or a character vector of strings).
r
gsub(pattern, replacement, x)
sub(pattern, replacement, x)
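Both functions also accept a character vector with more than one element as x and process each element separately. The sketch below illustrates this; the animals vector is made-up sample data, not part of the original example.

```r
# Hypothetical sample data: three strings in one character vector
animals <- c("cat nap", "cat and cat", "no match here")

# gsub() replaces every match in every element
all_replaced_vec <- gsub("cat", "dog", animals)

# sub() replaces only the first match within each element
first_replaced_vec <- sub("cat", "dog", animals)

print(all_replaced_vec)   # "dog nap" "dog and dog" "no match here"
print(first_replaced_vec) # "dog nap" "dog and cat" "no match here"
```

Elements without a match are returned unchanged, so the result always has the same length as the input.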
Example
This example shows how to replace all occurrences of "cat" with "dog" in a string using gsub(), and how sub() replaces only the first occurrence.
r
text <- "The cat sat on the cat mat."

# Replace every occurrence of "cat"
all_replaced <- gsub("cat", "dog", text)

# Replace only the first occurrence of "cat"
first_replaced <- sub("cat", "dog", text)

print(all_replaced)
print(first_replaced)
Output
[1] "The dog sat on the dog mat."
[1] "The dog sat on the cat mat."
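Because pattern may be a regular expression, the same two functions can match a class of text rather than one fixed word. Here is a minimal sketch, assuming a made-up order_note string, that replaces runs of digits:

```r
# Hypothetical sample string containing two runs of digits
order_note <- "Order 123 shipped with 45 items."

# "[0-9]+" matches one or more consecutive digits
all_masked <- gsub("[0-9]+", "#", order_note)
first_masked <- sub("[0-9]+", "#", order_note)

print(all_masked)   # "Order # shipped with # items."
print(first_masked) # "Order # shipped with 45 items."
```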
Common Pitfalls
One common mistake is confusing sub() and gsub(). sub() replaces only the first match, so use gsub() when you want to replace every match. Another pitfall is forgetting that pattern is treated as a regular expression by default, so special characters such as ., *, and ( must be escaped (for example "\\." to match a literal dot) or matched literally by passing fixed = TRUE.
r
wrong <- sub("cat", "dog", "cat cat cat")   # replaces only the first match
right <- gsub("cat", "dog", "cat cat cat")  # replaces all matches

print(wrong)
print(right)
Output
[1] "dog cat cat"
[1] "dog dog dog"
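The regular-expression pitfall is easiest to see with a dot, which matches any character unless it is escaped. The following sketch uses a made-up version string to compare an unescaped pattern, an escaped pattern, and the fixed = TRUE argument:

```r
# Hypothetical sample string
version <- "1.2.3"

# Unescaped "." is a regex wildcard, so every character is replaced
unescaped <- gsub(".", "-", version)

# "\\." escapes the dot so only literal dots are replaced
escaped <- gsub("\\.", "-", version)

# fixed = TRUE treats the pattern as plain text, not a regex
literal <- gsub(".", "-", version, fixed = TRUE)

print(unescaped) # "-----"
print(escaped)   # "1-2-3"
print(literal)   # "1-2-3"
```

When the pattern is a plain string with no regex meaning intended, fixed = TRUE is often the simpler choice because nothing needs escaping.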
Quick Reference
| Function | Description | Replaces |
|---|---|---|
| sub() | Replaces first match only | First occurrence |
| gsub() | Replaces all matches | All occurrences |
Key Takeaways
Use gsub() to replace all occurrences of a pattern in a string.
Use sub() to replace only the first occurrence of a pattern.
Patterns are treated as regular expressions by default; escape special characters or pass fixed = TRUE to match them literally.
Remember that sub() and gsub() return a new string; they do not change the original.
Test your replacement on sample strings to avoid unexpected results.
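The point about returning a new string can be checked directly. In this small sketch, with a made-up greeting variable, the original only changes once the result is assigned back to it:

```r
# Hypothetical sample string
greeting <- "hello world"

# The result is stored elsewhere, so greeting itself is untouched
replaced <- gsub("world", "R", greeting)
print(greeting) # "hello world"
print(replaced) # "hello R"

# Assign the result back to update the original variable
greeting <- gsub("world", "R", greeting)
print(greeting) # "hello R"
```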