How to Use the re Module in Python: Syntax, Examples, and Tips
Use the re module in Python to work with regular expressions for searching, matching, and manipulating text. Import it with import re, then use functions like re.search(), re.match(), and re.findall() to find patterns in strings.
Syntax
The re module provides functions to work with regular expressions. Key functions include:
- re.search(pattern, string): Finds the first match of pattern anywhere in string.
- re.match(pattern, string): Checks if string starts with pattern.
- re.findall(pattern, string): Returns all non-overlapping matches of pattern in string as a list.
- re.sub(pattern, repl, string): Replaces matches of pattern with repl in string.
Patterns are regular expressions written as strings.
```python
import re

# Basic usage examples
pattern = r"\bword\b"
text = "This is a word in a sentence."

# Search for pattern
match = re.search(pattern, text)

# Match at start
start_match = re.match(pattern, text)

# Find all matches
all_matches = re.findall(pattern, text)

# Replace pattern
replaced_text = re.sub(pattern, "replacement", text)
```
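The four functions return different kinds of values, which the snippet above does not print. As a small sketch reusing the same pattern and text, the following shows what each call gives back: re.search() and re.match() return a match object or None, re.findall() returns a list, and re.sub() returns a new string.

```python
import re

pattern = r"\bword\b"
text = "This is a word in a sentence."

# re.search returns a match object (or None if nothing matches)
match = re.search(pattern, text)
if match:
    print(match.group())  # the matched text: word
    print(match.span())   # (start, end) indexes: (10, 14)

# re.match returns None here because the text does not start with "word"
start_match = re.match(pattern, text)
print(start_match)  # None

# re.findall returns a list of matched strings
all_matches = re.findall(pattern, text)
print(all_matches)  # ['word']

# re.sub returns a new string; the original text is unchanged
replaced_text = re.sub(pattern, "replacement", text)
print(replaced_text)  # This is a replacement in a sentence.
```

Because re.search() and re.match() can return None, checking the result before calling group() avoids an AttributeError.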
Example
This example shows how to find all words starting with 'a' in a sentence using re.findall() and how to replace them with 'X' using re.sub().
```python
import re

text = "An apple a day keeps the doctor away."
pattern = r"\ba\w*\b"  # words starting with 'a'

# Find all words starting with 'a'
words_starting_with_a = re.findall(pattern, text, re.IGNORECASE)
print("Words starting with 'a':", words_starting_with_a)

# Replace words starting with 'a' with 'X'
replaced_text = re.sub(pattern, "X", text, flags=re.IGNORECASE)
print("Replaced text:", replaced_text)
```
Output
Words starting with 'a': ['An', 'apple', 'a', 'away']
Replaced text: X X X day keeps the doctor X.
Common Pitfalls
Common mistakes when using re include:
- Forgetting to use raw strings (prefix r) for patterns. In a normal string, Python processes backslash escapes first, so "\b" becomes a backspace character instead of a word boundary.
- Using re.match() expecting it to find matches anywhere; it only checks the start of the string.
- Not using flags like re.IGNORECASE when case-insensitive matching is needed.
- Confusing greedy and non-greedy matching, which affects how much text is matched.
```python
import re

text = "Hello hello"

# Wrong: without a raw string, "\b" is a backspace character, so this
# pattern raises no error but silently fails to match the word
pattern_wrong = "\bhello\b"

# Right: use a raw string so \b means a word boundary
pattern_right = r"\bhello\b"

# Wrong: re.match only checks the start of the string, which is
# "Hello" with a capital H, so the lowercase pattern does not match
match_wrong = re.match(pattern_right, text)  # None

# Right: use re.search to find the pattern anywhere in the string
match_right = re.search(pattern_right, text)

print("Match with re.match:", match_wrong)
print("Match with re.search:", match_right.group() if match_right else None)
```
Output
Match with re.match: None
Match with re.search: hello
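The greedy versus non-greedy pitfall from the list above is not covered by the example, so here is a minimal sketch of it. The sample string and variable names are made up for illustration. A quantifier like + matches as much as it can, while +? matches as little as it can.

```python
import re

# Hypothetical sample text with several angle-bracket tags
html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the LAST '>' in the string, giving one long match
greedy = re.findall(r"<.+>", html)
print(greedy)  # ['<b>bold</b> and <i>italic</i>']

# Non-greedy: .+? stops at the FIRST '>' it can, giving one match per tag
lazy = re.findall(r"<.+?>", html)
print(lazy)  # ['<b>', '</b>', '<i>', '</i>']
```

Adding ? after a quantifier (*?, +?, ??) is usually the quickest fix when a pattern matches more text than expected.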
Quick Reference
| Function | Purpose | Example Usage |
|---|---|---|
| re.search() | Finds first match anywhere in string | re.search(r'cat', text) |
| re.match() | Checks match only at start of string | re.match(r'cat', text) |
| re.findall() | Returns list of all matches | re.findall(r'cat', text) |
| re.sub() | Replaces matches with new text | re.sub(r'cat', 'dog', text) |
| re.compile() | Prepares regex pattern for reuse | pattern = re.compile(r'cat') |
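re.compile() appears in the table but not in the earlier examples. As a rough sketch, assuming the same pattern is applied to several strings, a compiled pattern object exposes the same operations as methods (search, match, findall, sub). The sample lines and variable names below are illustrative only.

```python
import re

# Compile once, including any flags, then reuse the pattern object
cat_pattern = re.compile(r"\bcat\b", re.IGNORECASE)

# Hypothetical input lines
lines = ["The cat sat.", "No felines here.", "CAT and cat."]

# The compiled object has the same methods as the module-level functions
counts = [len(cat_pattern.findall(line)) for line in lines]
print(counts)  # [1, 0, 2]

# Substitution works the same way, without passing the pattern again
first_replaced = cat_pattern.sub("dog", lines[0])
print(first_replaced)  # The dog sat.
```

Compiling mainly helps readability when a pattern is reused in many places, since the pattern and its flags are defined in a single spot.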
Key Takeaways
Always import the re module with import re before using regex functions.
Use raw strings (r"pattern") so that backslashes in regex patterns are not interpreted as Python string escapes.
re.search() finds a pattern anywhere; re.match() only at the start of the string.
Use re.findall() to get all matches as a list and re.sub() to replace matches.
Remember to use flags like re.IGNORECASE for case-insensitive matching.