NLP, ml, ~10 mins

Regular expressions for text cleaning in NLP - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to import the regular expressions module.

import [1]
A. re
B. regexp
C. regex
D. text
Common Mistakes
Using 'regex' or 'regexp', neither of which is a Python standard-library module.
Trying to import 'text', which is unrelated to regular expressions.
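As a minimal sketch of the standard-library module in use (the sample string and variable names are illustrative assumptions, not part of the exercise):

```python
import re  # Python's standard-library regular expression module

# Hypothetical sample text, used only for illustration.
sample = "NLP in 2024"

# re.findall returns every non-overlapping match as a list of strings.
digit_runs = re.findall(r"\d+", sample)
print(digit_runs)  # ['2024']
```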
2. Fill in the blank (medium)

Complete the code to remove all digits from the text using regex substitution.

clean_text = re.sub(r'[1]', '', text)
A. \D
B. \d
C. \s
D. \w
Common Mistakes
Using '\w', which matches letters, digits, and underscores, so it removes more than just digits.
Using '\s', which matches whitespace, not digits.
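The difference between these character classes is easiest to see side by side. The sketch below runs each one through re.sub on a hypothetical sample string (the string and the variable names are assumptions made for illustration):

```python
import re

# Hypothetical sample text, used only for illustration.
text = "Room 42, floor 3"

digits_removed = re.sub(r"\d", "", text)      # \d matches digits
non_digits_removed = re.sub(r"\D", "", text)  # \D matches everything except digits
spaces_removed = re.sub(r"\s", "", text)      # \s matches whitespace
word_chars_removed = re.sub(r"\w", "", text)  # \w matches letters, digits, underscore

print(repr(digits_removed))      # 'Room , floor '
print(repr(non_digits_removed))  # '423'
print(repr(spaces_removed))      # 'Room42,floor3'
```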
3. Fill in the blank (hard)

Fix the error in the regex pattern to remove punctuation marks from the text.

clean_text = re.sub(r'[[1]]', '', text)
A. .,!?
B. \.,!\?
C. \.,!?
D. \.,!\?\-
Common Mistakes
Leaving the dash '-' out of the character class, so hyphens stay in the text.
Leaving the dash unescaped in the middle of the class, where it is read as a range operator rather than a literal character.
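A short sketch of how escaping behaves inside a character class may help here. In Python's re module, '.' and '?' are already literal inside [...], so escaping them is optional, while a hyphen is safest escaped or placed at the start or end of the class. The sample string and variable names below are assumptions for illustration, and the sketch does not indicate which option the exercise accepts:

```python
import re

# Hypothetical sample text, used only for illustration.
text = "Wait... what?! Well-known, yes."

# Inside [...], '.' and '?' are literal whether or not they are escaped.
unescaped = re.sub(r"[.,!?]", "", text)
escaped = re.sub(r"[\.,!\?]", "", text)

# Adding an escaped hyphen to the class also strips dashes.
with_dash = re.sub(r"[\.,!\?\-]", "", text)

print(unescaped)  # Wait what Well-known yes
print(with_dash)  # Wait what Wellknown yes
```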
4. Fill in the blank (hard)

Complete the code to create a regex that removes all whitespace characters including tabs and newlines.

clean_text = re.sub(r'[1]', '', text)
A. \s
C. \S
Common Mistakes
Replacing with a space ' ' instead of an empty string, which leaves whitespace in the text.
Using '\S', which matches non-whitespace characters.
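To make the first mistake concrete, the sketch below contrasts removing whitespace with replacing it by a single space. The sample string and variable names are assumptions for illustration. Collapsing runs of whitespace is shown only for comparison, since the task itself asks for removal:

```python
import re

# Hypothetical sample text containing a newline, a tab, and repeated spaces.
text = "line one\n\tline  two "

# Replacing every whitespace character with '' removes it entirely.
removed = re.sub(r"\s", "", text)

# Replacing each run of whitespace with ' ' keeps words separated instead.
collapsed = re.sub(r"\s+", " ", text).strip()

print(removed)    # lineonelinetwo
print(collapsed)  # line one line two
```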
5. Fill in the blank (hard)

Fill all three blanks to create a dictionary comprehension that maps words to their lengths, but only for words longer than 3 characters.

lengths = { [1]: [2] for [3] in words if len([3]) > 3 }
A. word
B. len(word)
D. w
Common Mistakes
Using different variable names inconsistently.
Mapping keys or values incorrectly.
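As an illustration of the pattern described in this task, here is a minimal sketch of a filtered dictionary comprehension. The word list is a hypothetical example. Note that the same loop variable appears as the key, inside len(), and in the filter:

```python
# Hypothetical word list, used only for illustration.
words = ["the", "quick", "brown", "fox", "jumps"]

# Key: the word itself. Value: its length. Filter: keep words longer than 3 characters.
lengths = {word: len(word) for word in words if len(word) > 3}

print(lengths)  # {'quick': 5, 'brown': 5, 'jumps': 5}
```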