
Punctuation and special character removal in NLP - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to remove punctuation from the text using str.translate.

import string
text = "Hello, world!"
clean_text = text.[1](str.maketrans('', '', string.punctuation))
print(clean_text)
Options:
A. strip
B. replace
C. translate
D. split
Common Mistakes
Each replace call removes only one substring, so removing every punctuation mark with replace needs a loop over the characters.
Using strip only removes characters from the start and end of the string.
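
The two mistakes above can be seen on the sample string from this task. This is a minimal sketch; the variable names other than text are illustrative and not part of the exercise.

```python
import string

text = "Hello, world!"

# strip() only trims characters from both ends; the comma in the middle survives.
stripped = text.strip(string.punctuation)

# replace() handles one substring per call, so a single call removes only the comma.
replaced_once = text.replace(",", "")

# Covering every punctuation mark with replace() needs one call per character.
replaced_loop = text
for ch in string.punctuation:
    replaced_loop = replaced_loop.replace(ch, "")

print(stripped)       # Hello, world
print(replaced_once)  # Hello world!
print(replaced_loop)  # Hello world
```

The loop works, but it scans the string once per punctuation character, which is why a single-pass method is the better fit for the blank.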
2. Fill in the blank (medium)

Complete the code to remove all characters that are not letters or spaces using a list comprehension.

text = "Hello, world! 123"
clean_text = ''.join([c for c in text if [1] or c == ' '])
print(clean_text)
Options:
A. c.isalpha()
B. c.isdigit()
C. c.isupper()
D. c.isspace()
Common Mistakes
Using isdigit() keeps digits instead of letters.
Using isspace() only keeps spaces, removing letters.
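
To see what the wrong predicates actually keep, the sketch below runs the same comprehension with each of them on the task's sample string. The helper keep is hypothetical and exists only to avoid repeating the comprehension.

```python
text = "Hello, world! 123"

def keep(predicate):
    """Hypothetical helper: keep characters where predicate(c) is true, plus literal spaces."""
    return ''.join([c for c in text if predicate(c) or c == ' '])

digits_kept = keep(str.isdigit)   # digits and the two spaces
spaces_kept = keep(str.isspace)   # only the two spaces
upper_kept = keep(str.isupper)    # only the capital H and the two spaces

print(repr(digits_kept))  # '  123'
print(repr(spaces_kept))  # '  '
print(repr(upper_kept))   # 'H  '
```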
3. Fill in the blank (hard)

Complete the code to remove punctuation using a regular expression.

import re
text = "Hello, world!"
clean_text = re.sub([1], '', text)
print(clean_text)
Options:
A. '[a-zA-Z]'
B. '[!?,.]'
C. '[0-9]'
D. '[^a-zA-Z ]'
Common Mistakes
Using '[a-zA-Z]' removes letters instead of punctuation.
Using '[0-9]' removes digits only, not punctuation.
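
The sketch below shows what each pattern removes. The first two calls use the task's sample string; the second string, sample, is an assumption added here to show how an explicit list of marks differs from a negated character class once digits are present.

```python
import re

text = "Hello, world!"

# A class of letters deletes the letters and leaves the punctuation behind.
letters_removed = re.sub('[a-zA-Z]', '', text)

# A class of digits matches nothing in this string, so it comes back unchanged.
digits_removed = re.sub('[0-9]', '', text)

# Hypothetical second sample containing digits.
sample = "Hello, world! 123"
listed_marks_removed = re.sub('[!?,.]', '', sample)     # removes only the listed marks
non_letters_removed = re.sub('[^a-zA-Z ]', '', sample)  # removes everything except letters and spaces

print(repr(letters_removed))       # ', !'
print(repr(digits_removed))        # 'Hello, world!'
print(repr(listed_marks_removed))  # 'Hello world 123'
print(repr(non_letters_removed))   # 'Hello world '
```

A leading ^ inside the brackets negates the class, so the last pattern also drops digits and any punctuation mark not named in the explicit list.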
4. Fill in the blank (hard)

Fill both blanks to create a function that removes punctuation and converts text to lowercase.

import string

def clean_text(text):
    return text.[1](str.maketrans('', '', string.[2])).lower()
Options:
A. translate
B. replace
C. punctuation
D. whitespace
Common Mistakes
Using replace instead of translate fails here: replace expects an old and a new substring, not a translation table, and it cannot remove all punctuation in one call.
Using string.whitespace removes spaces, which we want to keep.
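
Both mistakes can be reproduced in a few lines. The task gives no sample input, so the string below is an assumption chosen for illustration.

```python
import string

text = "Hello, World!"  # hypothetical sample input

# Deleting string.whitespace removes the space between the words
# and leaves every punctuation mark in place.
table = str.maketrans('', '', string.whitespace)
no_whitespace = text.translate(table).lower()

# replace() needs an old and a new substring; a translation table
# passed as its only argument raises TypeError.
try:
    text.replace(str.maketrans('', '', string.punctuation))
    replace_error = None
except TypeError as exc:
    replace_error = exc

print(no_whitespace)                 # hello,world!
print(type(replace_error).__name__)  # TypeError
```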
5. Fill in the blank (hard)

Fill all three blanks to create a dictionary comprehension that maps words to their cleaned versions without punctuation and in lowercase.

import string
words = ['Hello!', 'World?', 'Test.']
clean_words = {word[1]: word.[2](str.maketrans('', '', string.[3])).lower() for word in words}
print(clean_words)
Options:
A. .lower()
B. translate
C. punctuation
D. .strip()
Common Mistakes
Calling strip() with no arguments removes only leading and trailing whitespace, not punctuation.
Not converting to lowercase causes inconsistent keys.
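
The effect on dictionary keys is easiest to see when the same word appears in two casings. The word list below is a hypothetical variation on the task's list, and the values are set to None because only the keys matter here.

```python
# Hypothetical word list with the same word in two casings.
words = ['Hello!', 'hello!', 'Test.']

# Raw keys: 'Hello!' and 'hello!' are treated as different entries.
raw_keys = {word: None for word in words}

# strip() with no arguments only trims whitespace, so the keys are unchanged.
stripped_keys = {word.strip(): None for word in words}

# Lowercasing the key merges the two casings into one entry.
lowered_keys = {word.lower(): None for word in words}

print(list(raw_keys))       # ['Hello!', 'hello!', 'Test.']
print(list(stripped_keys))  # ['Hello!', 'hello!', 'Test.']
print(list(lowered_keys))   # ['hello!', 'test.']
```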