Recall & Review

beginner

What is lowercasing in text preprocessing?

Lowercasing means converting all letters in text to lowercase. It helps treat words like 'Apple' and 'apple' as the same word.
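As a minimal sketch of the idea, assuming plain Python strings and the built-in `str.lower()` method (the sample word list is made up for illustration):

```python
# Illustrative word list (hypothetical data, not from any dataset).
words = ["Apple", "apple", "APPLE"]

# str.lower() returns a new string with cased characters converted to lowercase.
lowered = [w.lower() for w in words]

# After lowercasing, all three spellings collapse into one distinct form.
unique = set(lowered)

print(lowered)
print(unique)
```

Python also offers `str.casefold()`, a more aggressive variant intended for caseless matching; which one fits depends on the task.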

beginner

Why do we normalize text in NLP?

Normalization makes text consistent by removing surface variations such as accents, punctuation, or spacing. This lets a model treat equivalent forms of a word as the same token instead of as unrelated ones.

intermediate

Give an example of text normalization besides lowercasing.

Removing accents (e.g., changing 'café' to 'cafe') or replacing multiple spaces with a single space are examples of normalization.
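Both examples can be sketched with the Python standard library. The helper names `strip_accents` and `collapse_spaces` are hypothetical, chosen here for illustration; the approach assumes that decomposing with `unicodedata.normalize("NFKD", ...)` and dropping combining marks is acceptable for the text at hand, which may not hold for every language.

```python
import re
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical helper: remove accents by dropping combining marks."""
    # NFKD decomposes an accented letter into a base letter plus combining mark(s).
    decomposed = unicodedata.normalize("NFKD", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def collapse_spaces(text: str) -> str:
    """Hypothetical helper: replace runs of whitespace with a single space."""
    # Note: \s+ also matches tabs and newlines, not only the space character.
    return re.sub(r"\s+", " ", text).strip()


print(strip_accents("café"))
print(collapse_spaces("too    many   spaces"))
```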

intermediate

How does lowercasing affect model vocabulary size?

Lowercasing reduces vocabulary size by merging words that differ only in case, so the model has fewer distinct tokens to store and learn.
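The effect can be seen on a toy sentence. This is a rough sketch that assumes whitespace tokenization; the sentence is invented for illustration.

```python
# Toy sentence (hypothetical) containing the same words in different cases.
text = "The cat saw the Cat and THE dog"

# Naive whitespace tokenization, assumed here for simplicity.
tokens = text.split()

# Vocabulary = set of distinct tokens, with and without lowercasing.
vocab_cased = set(tokens)
vocab_lower = {t.lower() for t in tokens}

print(len(vocab_cased))  # 8 distinct case-sensitive tokens
print(len(vocab_lower))  # 5 distinct tokens after lowercasing
```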

advanced

What is a potential downside of lowercasing?

Lowercasing can discard information: the capitalization that marks proper nouns or acronyms (for example, 'US' versus 'us') disappears, which can matter in tasks such as named entity recognition.
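A small sketch of that loss, using two made-up sentences and whitespace tokenization as a simplifying assumption:

```python
# Two hypothetical sentences: one uses the acronym "US", the other the pronoun "us".
sentences = ["The US economy grew", "They told us the news"]

cased = [s.split() for s in sentences]
lowered = [s.lower().split() for s in sentences]

# With case preserved, the acronym and the pronoun are different tokens.
print("US" in cased[0], "us" in cased[0])      # True False

# After lowercasing, both sentences contain the same token "us".
print("us" in lowered[0], "us" in lowered[1])  # True True
```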

What does lowercasing do to the word 'Hello'?

A. Removes the word

B. Converts it to 'HELLO'

C. Converts it to 'hello'

D. Adds punctuation

Which of these is NOT a normalization step?

A. Adding random characters

B. Lowercasing

C. Removing accents

D. Replacing multiple spaces with one

Why normalize text before training an NLP model?

A. To increase text length

B. To make text consistent and easier to understand

C. To add noise to data

D. To remove all vowels

What is a common effect of lowercasing on vocabulary size?

A. Vocabulary size increases

B. Vocabulary size doubles

C. Vocabulary size stays the same

D. Vocabulary size decreases

Which is a risk of lowercasing text?

A. Losing important case information

B. Making text longer

C. Adding accents

D. Removing stopwords

Explain why lowercasing and normalization are important in preparing text for machine learning models.

Describe some common normalization techniques used in NLP besides lowercasing.

Practice

1. What is the main purpose of lowercasing text in Natural Language Processing?

easy

A. To translate text into another language

B. To convert all letters to lowercase so words like 'Apple' and 'apple' are treated the same

C. To remove all punctuation marks from the text

D. To split sentences into words

Solution

Step 1: Understand what lowercasing does

Step 2: Understand why lowercasing is used

Final Answer:

Quick Check:

Solution

Step 1: Recall Python string method for lowercasing

Step 2: Check each option

Final Answer:

Quick Check:

Solution

Step 1: Apply lower() method on the string 'Café'

Step 2: Understand effect on accented characters

Final Answer:

Quick Check:

Solution

Step 1: Understand what normalize('NFKD') does

Step 2: Check the code behavior

Final Answer:

Quick Check:

Solution

Step 1: Lowercase the text

Step 2: Normalize and remove accents

Step 3: Combine steps correctly

Final Answer:

Quick Check: