
Punctuation and special character removal in NLP - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of punctuation removal code
What is the output of the following Python code that removes punctuation and special characters from a text string?
import re
text = "Hello, world! Welcome to AI & ML." 
clean_text = re.sub(r'[^\w\s]', '', text)
print(clean_text)
A. Hello world Welcome to AI ML
B. Hello, world! Welcome to AI & ML.
C. Hello world! Welcome to AI & ML
D. Hello world Welcome to AI & ML.
💡 Hint
Look at the regular expression pattern used in re.sub to remove characters.
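To explore the pattern on a different input, the sketch below wraps the same regular expression in a helper. The function name strip_punctuation and the sample string are illustrative assumptions, not part of the challenge. Note that \w matches letters, digits, and the underscore, so underscores are kept.

```python
import re

def strip_punctuation(text: str) -> str:
    """Remove every character that is neither a word character nor whitespace."""
    # [^\w\s] is a negated class: anything that is NOT \w (letters, digits, _)
    # and NOT \s (whitespace) is replaced with the empty string.
    return re.sub(r"[^\w\s]", "", text)

# Hypothetical sample input chosen to show the underscore and hyphen behavior.
sample = "snake_case, e-mail: 50% off!"
print(strip_punctuation(sample))  # snake_case email 50 off
```

The hyphen in "e-mail" is removed, which joins the two halves into one token. Whether that is desirable depends on the task.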
Model Choice
intermediate
Best model for text preprocessing with punctuation removal
Which model type is best suited for handling text data that requires punctuation and special character removal before training?
A. Convolutional Neural Network (CNN) for image classification
B. Recurrent Neural Network (RNN) for sequence data
C. Linear Regression for numerical prediction
D. K-Means clustering for unsupervised grouping
💡 Hint
Think about models designed to process sequences of words.
Hyperparameter
advanced
Choosing tokenizer settings for punctuation removal
When using a tokenizer in NLP, which setting helps ensure punctuation and special characters are removed during tokenization?
A. Set 'filters' parameter to include punctuation characters
B. Set 'strip_accents=True' to remove accents from characters
C. Set 'lowercase=True' to convert all text to lowercase
D. Set 'max_len' to limit the maximum token length
💡 Hint
Look for the parameter that controls which characters are removed during tokenization.
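The challenge does not name a specific tokenizer library, so the following is only a plain-Python sketch of how a character-filtering option could work. The function simple_tokenize and its filters and lower parameters are hypothetical stand-ins, not any library's actual API. Each filtered character is replaced with a space before splitting on whitespace.

```python
import string
from typing import List

def simple_tokenize(text: str, filters: str = string.punctuation,
                    lower: bool = True) -> List[str]:
    """Hypothetical tokenizer: drop every character listed in `filters`."""
    if lower:
        text = text.lower()
    # Map each filtered character to a space so neighbouring words stay separate.
    table = str.maketrans({ch: " " for ch in filters})
    return text.translate(table).split()

print(simple_tokenize("Hello, world! AI & ML."))
# ['hello', 'world', 'ai', 'ml']
print(simple_tokenize("Hello, world! AI & ML.", filters=""))
# ['hello,', 'world!', 'ai', '&', 'ml.']
```

With an empty filters string, punctuation stays attached to the tokens, which shows that the character list alone controls what is removed in this sketch.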
Metrics
advanced
Effect of punctuation removal on text classification accuracy
If you train a text classification model on raw text and then on text with punctuation removed, what is the most likely effect on accuracy?
A. Accuracy will always increase because punctuation adds noise
B. Accuracy remains exactly the same regardless of punctuation
C. Accuracy will always decrease because punctuation carries important meaning
D. Accuracy may increase or decrease depending on the dataset and task
💡 Hint
Consider how punctuation might affect different types of text data.
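One way to reason about this is to check which distinctions survive cleaning. The sketch below is a minimal illustration with made-up example sentences and a hypothetical helper named collapses. It tests whether two different raw texts become identical once punctuation is stripped, in which case a model trained on cleaned text can no longer tell them apart.

```python
import re

def strip_punctuation(text: str) -> str:
    """Remove characters that are neither word characters nor whitespace."""
    return re.sub(r"[^\w\s]", "", text)

def collapses(a: str, b: str) -> bool:
    """Return True if two distinct texts become identical after cleaning."""
    return a != b and strip_punctuation(a) == strip_punctuation(b)

# Illustrative pairs (assumed examples, not taken from any dataset).
print(collapses("I love it :)", "I love it :("))    # True: the emoticons vanish
print(collapses("Great.", "Great?"))                # True: statement vs. question
print(collapses("Cheap, fast.", "Slow, pricey."))   # False: the words still differ
```

For tasks where emoticons or question marks signal the label, this loss matters. For tasks driven by vocabulary alone, it may not.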
🔧 Debug
expert
Debugging punctuation removal code
What error, if any, does the following code raise when removing punctuation from text?
import string
text = "Hello, world!"
clean_text = text.translate(str.maketrans('', '', string.punctuation))
print(clean_text)
A. TypeError: translate() argument must be a mapping or None
B. SyntaxError: invalid syntax in maketrans
C. No error, output: Hello world
D. AttributeError: 'str' object has no attribute 'translate'
💡 Hint
Check the usage of str.maketrans and translate methods.