
Remove Special Characters from String in Python Easily

To remove special characters from a string in Python, use the re.sub() function from the re module with a pattern that matches non-alphanumeric characters. This replaces all special characters with an empty string, leaving only letters and numbers.

Syntax

The common syntax to remove special characters using regular expressions is:

  • re.sub(pattern, replacement, string) - replaces parts of string matching pattern with replacement.
  • pattern - a regex pattern to match special characters, e.g., [^a-zA-Z0-9] means any character that is not an ASCII letter or digit (so spaces, punctuation, and accented letters all match).
  • replacement - usually an empty string '' to remove matched characters.
python
import re

clean_string = re.sub(r'[^a-zA-Z0-9]', '', original_string)
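As a small sketch of how the three arguments interact, the replacement does not have to be an empty string. The sample address and the variable name masked below are made up for illustration.

```python
import re

# Hypothetical sample input, not taken from the article
original_string = "user@example.com"

# Replace every character that is not an ASCII letter or digit with an underscore
masked = re.sub(r'[^a-zA-Z0-9]', '_', original_string)
print(masked)  # user_example_com
```

Passing '' instead of '_' removes the matched characters rather than substituting them.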

Example

This example shows how to remove all special characters from a string, keeping only letters and numbers.

python
import re

original_string = "Hello, World! Welcome to Python 3.9."
clean_string = re.sub(r'[^a-zA-Z0-9]', '', original_string)
print(clean_string)
Output
HelloWorldWelcometoPython39

Common Pitfalls

One common mistake is removing spaces unintentionally when you want to keep words separated. The pattern [^a-zA-Z0-9] matches spaces too, so they are stripped along with the punctuation. To keep spaces, add a space to the allowed characters, as in [^a-zA-Z0-9 ].

Another pitfall is forgetting to import the re module before calling re.sub(), which raises a NameError.

python
import re

# Wrong: removes spaces too
text = "Hello, World!"
print(re.sub(r'[^a-zA-Z0-9]', '', text))  # Output: HelloWorld

# Right: keeps spaces
print(re.sub(r'[^a-zA-Z0-9 ]', '', text))  # Output: Hello World
Output
HelloWorld
Hello World
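A related pitfall, shown here as a sketch: because [^a-zA-Z0-9 ] only allows ASCII letters, it also strips accented letters. One possible alternative, assuming you want to keep Unicode letters, is [^\w\s]. Note that \w also keeps underscores and \s keeps tabs and newlines. The sample text is made up for illustration.

```python
import re

# Hypothetical sample text with accented letters (written as escapes to be unambiguous)
text = "Caf\u00e9 d\u00e9j\u00e0 vu, 100%!"

# ASCII-only class: accented letters are treated as special characters
ascii_only = re.sub(r'[^a-zA-Z0-9 ]', '', text)

# \w matches Unicode letters, digits, and underscore; \s matches whitespace
unicode_aware = re.sub(r'[^\w\s]', '', text)

print(ascii_only)     # Caf dj vu 100
print(unicode_aware)  # Café déjà vu 100
```

Which pattern is right depends on whether non-English letters count as "special" for your data.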

Quick Reference

Tips to remove special characters:

  • Use re.sub(r'[^a-zA-Z0-9]', '', text) to remove all except letters and numbers.
  • Add space inside brackets [^a-zA-Z0-9 ] to keep spaces.
  • Use raw strings r'' for regex patterns to avoid escape issues.
  • Remember to import re before using regex functions.
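The tips above could be wrapped in one small helper. This is only a sketch: the function name remove_special_characters and its keep_spaces parameter are hypothetical and not part of the re module.

```python
import re

def remove_special_characters(text, keep_spaces=True):
    """Hypothetical helper: strip everything except ASCII letters and digits.

    When keep_spaces is True, plain spaces are kept as well.
    """
    pattern = r'[^a-zA-Z0-9 ]' if keep_spaces else r'[^a-zA-Z0-9]'
    return re.sub(pattern, '', text)

print(remove_special_characters("Hello, World!"))                     # Hello World
print(remove_special_characters("Hello, World!", keep_spaces=False))  # HelloWorld
```

Keeping the pattern choice in one place makes it harder to drop spaces by accident.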

Key Takeaways

  • Use the re.sub() function with a regex pattern to remove special characters from strings.
  • Include a space in the pattern's allowed characters if you want to keep spaces between words.
  • Always import the re module before using regex functions.
  • Use raw strings (r'') for regex patterns so backslashes reach the regex engine unchanged.
  • Test your pattern to ensure it removes only unwanted characters.
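One way to test a pattern is to run it against a few inputs with known results. The sketch below assumes the space-preserving pattern from this article. The names PATTERN and cases and the sample inputs are chosen for illustration.

```python
import re

PATTERN = r'[^a-zA-Z0-9 ]'  # keeps ASCII letters, digits, and spaces

# Hypothetical inputs mapped to the output we expect after cleaning
cases = {
    "Hello, World!": "Hello World",
    "Python 3.9": "Python 39",
    "no_specials here": "nospecials here",  # underscore counts as special
    "": "",
}

for raw, expected in cases.items():
    result = re.sub(PATTERN, '', raw)
    assert result == expected, f"{raw!r} -> {result!r}, expected {expected!r}"

print("all checks passed")
```

If a check fails, the assertion message shows the input, the actual result, and the expected result.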