Lowercasing and normalization in NLP - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
What is the output of this normalization code?
Consider the following Python code that normalizes a text by lowercasing and removing punctuation. What is the printed output?
import string
text = "Hello, World! Let's normalize this text."
normalized = ''.join(ch for ch in text.lower() if ch not in string.punctuation)
print(normalized)
A. Hello World Lets Normalize This Text
B. hello, world! let's normalize this text.
C. hello world lets normalize this text
D. hello world let's normalize this text
💡 Hint
Think about what lowercasing and removing punctuation does to the original text.
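The same two steps can also be written with `str.translate`, which deletes every character listed in a translation table in one pass. This is a minimal sketch on a different, made-up sentence (the variable names `sample`, `table`, and `cleaned` are assumptions for illustration, not part of the challenge):

```python
import string

# Hypothetical sample text, chosen only for illustration.
sample = "Dr. Smith's e-mail: READ it, please!"

# Build a table that maps every ASCII punctuation character to "delete".
table = str.maketrans("", "", string.punctuation)

# Lowercase first, then drop the punctuation characters.
cleaned = sample.lower().translate(table)
print(cleaned)  # dr smiths email read it please
```

Note that `string.punctuation` covers ASCII punctuation only, so curly quotes or other Unicode punctuation would pass through unchanged.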
🧠 Conceptual
intermediate
Why is lowercasing important in text normalization?
Which of the following best explains why lowercasing is a common step in text normalization for machine learning?
A. It increases the length of the text to improve model accuracy.
B. It removes all punctuation from the text.
C. It translates text into a different language.
D. It reduces the number of unique words by treating 'Apple' and 'apple' as the same word.
💡 Hint
Think about how case differences affect word counts.
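To make the hint concrete, here is a small sketch that counts distinct tokens in a made-up sentence before and after lowercasing. The sentence and the whitespace-based tokenization are assumptions for illustration only; a real pipeline would likely use a proper tokenizer.

```python
from collections import Counter

# Hypothetical sample sentence, chosen only for illustration.
text = "Apple pie and apple juice: Apple fans love APPLE products"

# Naive tokenization: drop the colon and split on whitespace.
raw_tokens = text.replace(":", "").split()
lower_tokens = text.lower().replace(":", "").split()

raw_counts = Counter(raw_tokens)
lower_counts = Counter(lower_tokens)

print(len(raw_counts))        # 9 distinct tokens with original casing
print(len(lower_counts))      # 7 distinct tokens after lowercasing
print(lower_counts["apple"])  # 4 occurrences counted under one entry
```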
Metrics
advanced
How does normalization affect model accuracy?
You train two text classification models: Model A uses raw text, Model B uses normalized text (lowercased, punctuation removed). Which outcome is most likely?
A. Model A achieves higher accuracy because raw text has more information.
B. Model B achieves higher accuracy because normalization reduces noise and vocabulary size.
C. Both models have the same accuracy because normalization does not affect text data.
D. Model B performs worse because removing punctuation removes important meaning.
💡 Hint
Consider how noise and vocabulary size affect learning.
🔧 Debug
advanced
Identify the error in this normalization code
What error does this code raise when run?
import string
text = "Normalize THIS!"
normalized = ''.join(ch for ch in text.lower() if ch != string.punctuation)
print(normalized)
A. No error, prints 'normalize this!'
B. TypeError: 'in <string>' requires string as left operand, not 'str'
C. TypeError: '!=' not supported between instances of 'str' and 'str'
D. The code prints 'normalize this' without punctuation
💡 Hint
Look carefully at how punctuation is checked in the condition.
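As background for the hint, the sketch below contrasts an equality test against `string.punctuation` with a membership test, using a separate example string. The helper name `strip_punctuation` is hypothetical and is introduced here only for illustration.

```python
import string

ch = "?"

# Equality compares the single character with the whole punctuation string.
print(ch == string.punctuation)  # False
print(ch != string.punctuation)  # True

# Membership asks whether the character occurs anywhere in that string.
print(ch in string.punctuation)  # True


def strip_punctuation(text: str) -> str:
    """Hypothetical helper: lowercase text and drop ASCII punctuation."""
    return "".join(c for c in text.lower() if c not in string.punctuation)


print(strip_punctuation("Wait, what?"))  # wait what
```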
Model Choice
expert
Choosing the best normalization for noisy text data
You have a dataset of social media posts with many uppercase letters, emojis, and punctuation. Which normalization approach is best before training a sentiment analysis model?
A. Lowercase all text, remove punctuation, and remove emojis
B. Keep original casing, keep punctuation, remove emojis
C. Lowercase all text, keep punctuation and emojis
D. Remove punctuation and emojis, keep original casing
💡 Hint
Consider what noise elements can confuse the model and what information is useful.
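To see what one of these pipelines looks like in practice, here is a minimal sketch of the variant that lowercases text while leaving punctuation and emojis in place. The function name `normalize_post`, the whitespace collapsing step, and the sample post are assumptions for illustration; whether this variant suits a given dataset depends on the model and the task.

```python
import re


def normalize_post(post: str) -> str:
    """Hypothetical helper: lowercase and collapse whitespace only.

    Punctuation and emojis are left untouched, since str.lower()
    changes only cased characters.
    """
    lowered = post.lower()
    return re.sub(r"\s+", " ", lowered).strip()


# Made-up social media post for illustration.
sample = "I LOVE this phone!!!   😍  Best. Purchase. EVER"
normalized = normalize_post(sample)
print(normalized)  # i love this phone!!! 😍 best. purchase. ever
```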