Recall & Review
beginner
What is a regular expression (regex)?
A regular expression is a pattern of characters used to find or match text. It helps to search, replace, or clean text by describing what to look for.
beginner
Why do we use regular expressions for text cleaning in machine learning?
We use regex to remove unwanted parts such as extra spaces, special characters, or numbers from text. This reduces noise and gives the model more consistent input.
intermediate
What does the regex pattern '\s+' match?
It matches one or more consecutive whitespace characters, such as spaces, tabs, or newlines. This is useful for finding runs of extra whitespace to collapse or remove.
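As a minimal sketch of this pattern in Python's standard `re` module (the sample string and variable names are made up for illustration), each run of whitespace can be collapsed into a single space:

```python
import re

# Hypothetical sample text with repeated spaces, a tab, and a newline.
raw = "machine   learning\tis \n fun"

# Replace every run of one or more whitespace characters with one space,
# then trim any leading or trailing space.
collapsed = re.sub(r"\s+", " ", raw).strip()

print(collapsed)  # machine learning is fun
```

Replacing with a single space rather than an empty string keeps neighbouring words from being glued together.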
intermediate
How can you remove all digits from a text using regex?
Use the pattern '\d', which matches any single digit, and replace every match with an empty string. The variant '\d+' matches a whole run of digits at once.
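A short Python sketch of digit removal, assuming the standard `re` module and an invented sample sentence:

```python
import re

# Hypothetical sample sentence containing numbers.
raw = "Order 66 shipped in 2023"

# Delete every digit. The spaces around the numbers are left behind.
no_digits = re.sub(r"\d", "", raw)

# Optional follow-up: collapse the leftover whitespace and trim the ends.
tidy = re.sub(r"\s+", " ", no_digits).strip()

print(repr(no_digits))  # 'Order  shipped in '
print(repr(tidy))       # 'Order shipped in'
```

Removing digits often leaves double or trailing spaces, which is why a whitespace pass usually follows.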
advanced
Explain the regex pattern '[^a-zA-Z ]' and its use in text cleaning.
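This pattern matches any single character that is NOT an ASCII letter (a-z or A-Z) or a space. Replacing each match with an empty string strips punctuation and special symbols. Note that it also removes digits and non-ASCII letters such as 'é'.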
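The behaviour of this negated character class can be sketched in Python as follows (the sample strings are illustrative assumptions, not taken from any dataset):

```python
import re

# Hypothetical sample text with punctuation, an apostrophe, and digits.
raw = "Hello, World! It's 2024."

# Keep only ASCII letters and spaces; everything else is deleted.
letters_only = re.sub(r"[^a-zA-Z ]", "", raw)

# Side effect worth knowing: accented letters are removed as well.
accented = re.sub(r"[^a-zA-Z ]", "", "café")

print(repr(letters_only))  # 'Hello World Its '
print(repr(accented))      # 'caf'
```

If the text is not plain English, a broader class may be a better fit than a-zA-Z.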
Which regex pattern matches one or more whitespace characters?
The pattern '\s+' matches one or more whitespace characters like spaces or tabs.
What does the regex '\d' match?
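The '\d' pattern matches any single digit from 0 to 9. In some engines, including Python 3 with text strings, it also matches digits from other scripts by default.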
How would you remove punctuation from text using regex?
The pattern '[^a-zA-Z ]' matches anything that is not an ASCII letter or a space, so replacing each match with an empty string removes punctuation, along with digits and other symbols.
Which regex pattern matches any word character (letters, digits, underscore)?
'\w' matches any word character including letters, digits, and underscore.
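One way to see '\w' in action is to extract runs of word characters with Python's `re.findall`; the sample string here is invented for illustration:

```python
import re

# Hypothetical sample text with an underscore, digits, a hyphen, and punctuation.
raw = "user_42 signed-up!"

# Each match is a run of letters, digits, or underscores.
# The hyphen and the exclamation mark are not word characters, so they split or end tokens.
tokens = re.findall(r"\w+", raw)

print(tokens)  # ['user_42', 'signed', 'up']
```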
What is the purpose of using regex in text cleaning for machine learning?
Regex helps find and remove unwanted patterns like extra spaces, digits, or punctuation to clean text.
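The individual patterns can be combined into one cleaning step. The sketch below is only one possible ordering, assuming English text; `clean_text` is a hypothetical helper name, not part of any library:

```python
import re

def clean_text(text: str) -> str:
    """Hypothetical helper: lowercase, drop digits and punctuation, tidy spaces."""
    text = text.lower()
    # Remove runs of digits.
    text = re.sub(r"\d+", "", text)
    # Replace anything that is not a lowercase ASCII letter or a space with a
    # space, so words separated only by punctuation are not merged.
    text = re.sub(r"[^a-z ]", " ", text)
    # Collapse the resulting runs of whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()

sample = "  Great product!!!  Bought 2 in 2023.\n"
print(clean_text(sample))  # great product bought in
```

Whether digits or punctuation should be removed at all depends on the task, so treat each step as optional.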
Describe how regular expressions help in cleaning text data for machine learning.
Think about how patterns can find spaces, digits, or symbols to remove.
Explain the difference between '\s', '\d', and '[^a-zA-Z ]' regex patterns in text cleaning.
Consider what kinds of characters each pattern targets.