Recall & Review

beginner

What is a regular expression (regex)?

A regular expression (regex) is a sequence of characters that defines a search pattern. By describing what to look for, it lets you find, replace, or remove matching text.
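As a small illustration of "find" and "replace", the sketch below uses Python's standard `re` module. The sample sentence, the phone-number pattern, and the `<PHONE>` placeholder are made up for this example.

```python
import re

# Illustrative sample text (not from any real dataset).
text = "Contact us at 555-1234 or 555-9876."

# The pattern describes "three digits, a hyphen, four digits".
pattern = r"\d{3}-\d{4}"

# Find: list every substring that matches the pattern.
matches = re.findall(pattern, text)
print(matches)  # ['555-1234', '555-9876']

# Replace: swap each match for a placeholder token.
masked = re.sub(pattern, "<PHONE>", text)
print(masked)  # Contact us at <PHONE> or <PHONE>.
```

The raw-string prefix `r"..."` keeps Python from interpreting the backslashes before the regex engine sees them.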

beginner

Why do we use regular expressions for text cleaning in machine learning?

Regex removes unwanted parts of text, such as extra spaces, special characters, or numbers. The cleaned text is more consistent, so the same word is less likely to appear in several different forms and the model sees less noise.
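One way to see why this helps is to run two differently formatted strings through the same cleaning steps. The `clean` helper below is a hypothetical sketch, not a standard function, and the exact steps (lowercasing, keeping only a-z and spaces) are assumptions chosen for the example.

```python
import re

def clean(text: str) -> str:
    """Hypothetical helper: lowercase, drop non-letters, collapse whitespace."""
    text = text.lower()
    # Replace anything that is not a lowercase ASCII letter or a space with a space.
    text = re.sub(r"[^a-z ]", " ", text)
    # Collapse runs of whitespace into a single space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()

a = clean("Great   product!!! 10/10")
b = clean("great product")

print(a)       # great product
print(a == b)  # True
```

After cleaning, both inputs map to the same string, so a downstream tokenizer or model would treat them identically.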

intermediate

What does the regex pattern '\s+' match?

It matches one or more consecutive whitespace characters, such as spaces, tabs, or newlines. It is useful for finding runs of extra whitespace so they can be collapsed or removed.
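A minimal sketch of collapsing whitespace with `re.sub`, using a made-up input string. Replacing with a single space (rather than an empty string) keeps the words separated.

```python
import re

# Illustrative input with mixed spaces, a tab, and newlines.
raw = "  Hello \t world\n\nthis   is   text  "

# Each run of whitespace becomes one space; strip() trims the ends.
collapsed = re.sub(r"\s+", " ", raw).strip()
print(collapsed)  # Hello world this is text

# Replacing with an empty string instead would glue the words together.
glued = re.sub(r"\s+", "", "a b  c")
print(glued)  # abc
```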

intermediate

How can you remove all digits from a text using regex?

Use the pattern '\d', which matches any single digit. Replacing every match with an empty string removes all digits from the text.
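A short sketch of digit removal in Python, with an invented sample sentence. Note that removing digits can leave doubled spaces behind, so a whitespace cleanup step is often applied afterwards.

```python
import re

# Illustrative sample text.
text = "Order 66 shipped in 2 days"

# Remove every digit; the surrounding spaces are left in place.
no_digits = re.sub(r"\d", "", text)
print(repr(no_digits))  # 'Order  shipped in  days'

# Optional follow-up: collapse the leftover double spaces.
tidy = re.sub(r"\s+", " ", no_digits).strip()
print(tidy)  # Order shipped in days
```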

advanced

Explain the regex pattern '[^a-zA-Z ]' and its use in text cleaning.

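This pattern matches any single character that is NOT an ASCII letter (a-z or A-Z) or a space. Replacing the matches with an empty string removes punctuation and special symbols. Note that it also removes digits and accented letters, since they fall outside a-z and A-Z.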
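The sketch below applies this pattern with `re.sub` to two made-up strings. The second string is there to show a side effect worth knowing about: non-ASCII letters are stripped as well.

```python
import re

# The pattern from the card above: anything that is not an ASCII letter or a space.
pattern = r"[^a-zA-Z ]"

# Punctuation, the apostrophe, and the digits are all removed.
letters_only = re.sub(pattern, "", "Hello, World! It's 2024.")
print(repr(letters_only))  # 'Hello World Its '

# Accented letters are outside a-z/A-Z, so they are removed too.
accented = re.sub(pattern, "", "café naïve")
print(accented)  # caf nave
```

If accented or non-Latin text matters for the task, a pattern built on `\w` or an explicit list of punctuation may be a better fit than this one.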

Which regex pattern matches one or more whitespace characters?

A. \s+

B. \d+

C. [a-z]+

D. \w+

What does the regex '\d' match?

A. Any whitespace

B. Any letter

C. Any digit

D. Any special character

How would you remove punctuation from text using regex?

A. Replace '[^a-zA-Z ]' with an empty string

B. Replace '\d' with an empty string

C. Replace '\s+' with an empty string

D. Replace '[a-z]' with an empty string

Which regex pattern matches any word character (letters, digits, underscore)?

A. \d

B. \s

C. [^a-zA-Z]

D. \w

What is the purpose of using regex in text cleaning for machine learning?

A. To find and fix spelling errors

B. To find and remove unwanted text patterns

C. To add random characters

D. To translate text to another language

Describe how regular expressions help in cleaning text data for machine learning.

Explain the difference between '\s', '\d', and '[^a-zA-Z ]' regex patterns in text cleaning.

Practice

(1/5)

1. What is the main purpose of using regular expressions in text cleaning for NLP?

easy

A. To find and remove unwanted patterns or characters in text

B. To train machine learning models directly

C. To store large datasets efficiently

D. To visualize text data with graphs

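Solution

Step 1: Understand the role of regular expressions

Regular expressions describe patterns in text, so they can locate specific characters or sequences.

Step 2: Connect to text cleaning

In text cleaning, the matched patterns (for example extra whitespace, punctuation, or digits) are removed or replaced.

Final Answer:

To find and remove unwanted patterns or characters in text (Option A)

Quick Check:

Regex matches patterns; cleaning removes the unwanted matches.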

Solution

Step 1: Recall Python's regex module name

Step 2: Check syntax correctness

Final Answer:

Quick Check:

Solution

Step 1: Understand the regex pattern used

Step 2: Apply re.sub to remove unwanted characters

Final Answer:

Quick Check:

Solution

Step 1: Check regex pattern correctness

Step 2: Verify code syntax and function usage

Final Answer:

Quick Check:

Solution

Step 1: Identify a regex pattern that matches URLs

Step 2: Understand the code's cleaning steps

Final Answer:

Quick Check: