
How to Split String Using Regex in Python: Simple Guide

Use the re.split() function from Python's re module to split a string wherever a regex pattern matches. Pass the pattern and the string to re.split(), and it returns a list of the substrings found between the matches.

Syntax

The basic syntax to split a string using regex in Python is:

  • re.split(pattern, string, maxsplit=0, flags=0)

Where:

  • pattern is the regex pattern to split on.
  • string is the input string to split.
  • maxsplit (optional) limits the number of splits; 0 (the default) means no limit. Once the limit is reached, the rest of the string is returned as the last element.
  • flags (optional) modify regex behavior (like case-insensitive).
python
import re

result = re.split(r'\W+', 'Hello, world! Welcome to Python.')
print(result)
Output
['Hello', 'world', 'Welcome', 'to', 'Python', '']
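
The optional maxsplit and flags parameters described above can be sketched as follows. This is a minimal illustration; the sample strings and variable names are made up for the example.

```python
import re

text = 'one1two2three3four'

# maxsplit=2 stops after two splits; the rest of the string
# is kept intact as the final element.
limited = re.split(r'\d', text, maxsplit=2)
print(limited)  # ['one', 'two', 'three3four']

# flags=re.IGNORECASE lets the pattern 'x' also match an uppercase 'X'.
parts = re.split(r'x', 'aXbxc', flags=re.IGNORECASE)
print(parts)  # ['a', 'b', 'c']
```

Passing maxsplit and flags by keyword keeps the call readable and avoids mixing up their positions.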

Example

This example splits a sentence into words, treating any run of one or more non-word characters (\W+) as the separator.

python
import re

text = 'Hello, world! Welcome to Python.'
words = re.split(r'\W+', text)
print(words)
Output
['Hello', 'world', 'Welcome', 'to', 'Python', '']

The trailing '' appears because the final period matches \W+ at the very end of the string.
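
When the text ends with punctuation, re.split(r'\W+', ...) leaves an empty string at the end of the list. If that is unwanted, two possible ways to drop it are sketched below; which one fits depends on your data, and the variable names are only illustrative.

```python
import re

text = 'Hello, world! Welcome to Python.'

# The final '.' matches \W+, so the raw result ends with an empty string.
raw = re.split(r'\W+', text)

# Option 1: keep only the non-empty pieces.
words = [w for w in raw if w]
print(words)  # ['Hello', 'world', 'Welcome', 'to', 'Python']

# Option 2: match the words themselves instead of the separators.
found = re.findall(r'\w+', text)
print(found)  # ['Hello', 'world', 'Welcome', 'to', 'Python']
```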

Common Pitfalls

One common mistake is using str.split() when you need regex splitting: str.split() only splits on a fixed string, not on a pattern. Another is forgetting that re.split() puts empty strings in the result when the pattern matches at the start or end of the string.

Also, if your regex pattern contains capturing groups (parentheses), the matched separators are included in the result list.

python
import re

# Wrong: using str.split() when regex needed
text = 'apple, banana; orange'
print(text.split(','))  # Only splits by comma

# Right: using re.split() to split by comma or semicolon
print(re.split(r'[;,]\s*', text))

# Capturing group example
print(re.split(r'(,|;)', text))  # Includes separators in output
Output
['apple', ' banana; orange']
['apple', 'banana', 'orange']
['apple', ',', ' banana', ';', ' orange']
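
The empty-string pitfall can be seen in a short sketch. It also shows a non-capturing group, (?:...), as one way to group alternatives without adding the separators to the result. The sample strings are assumptions chosen for illustration.

```python
import re

# The pattern matches at the very start and very end of the string,
# so the result begins and ends with an empty string.
edges = re.split(r'[;,]', ',apple,banana;')
print(edges)  # ['', 'apple', 'banana', '']

# A non-capturing group (?:...) groups the alternatives
# but keeps the separators out of the result.
grouped = re.split(r'(?:,|;)\s*', 'apple, banana; orange')
print(grouped)  # ['apple', 'banana', 'orange']
```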

Quick Reference

Tips for using re.split():

  • Use raw strings (prefix r) for regex patterns to avoid escaping issues.
  • Set maxsplit to limit splits if needed.
  • Use flags=re.IGNORECASE for case-insensitive splitting.
  • Remember capturing groups include separators in output.
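
To illustrate the raw-string tip, here is a small sketch of how the same pattern behaves with and without the r prefix. In a normal string literal, \b is the backspace character rather than the regex word boundary; the sample text is made up for the example.

```python
import re

text = 'one and two and band'

# Without the r prefix, '\b' is a backspace character,
# so this pattern never matches and nothing is split.
plain = re.split(' \band\b ', text)
print(plain)  # ['one and two and band']

# With the r prefix, \b reaches the regex engine as a word boundary,
# so the standalone word 'and' is used as the separator.
raw = re.split(r' \band\b ', text)
print(raw)  # ['one', 'two', 'band']
```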

Key Takeaways

  • Use re.split() to split strings by regex patterns in Python.
  • Pass the regex pattern and the string to re.split() to get a list of parts.
  • Capturing groups in the pattern cause the matched separators to appear in the result.
  • Write regex patterns as raw strings (r'pattern') to avoid escaping problems.
  • Set maxsplit to control how many splits happen.