Challenge - 5 Problems

Lemmatization Mastery

❓ Predict Output

intermediate

What is the output of this spaCy lemmatization code?

Given the following code snippet using spaCy, what will be the printed list of lemmas?

import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp('The striped bats are hanging on their feet for best')
lemmas = [token.lemma_ for token in doc]
print(lemmas)

A. ['the', 'striped', 'bat', 'be', 'hang', 'on', 'their', 'foot', 'for', 'good']

B. ['the', 'striped', 'bat', 'be', 'hang', 'on', 'their', 'feet', 'for', 'best']

C. ['The', 'striped', 'bats', 'are', 'hanging', 'on', 'their', 'feet', 'for', 'best']

D. ['the', 'striped', 'bat', 'are', 'hang', 'on', 'their', 'feet', 'for', 'best']

❓ Model Choice

intermediate

Which spaCy model is best for accurate lemmatization?

You want to perform lemmatization on English text with good accuracy and speed. Which spaCy model should you choose?

A. en_core_web_sm (small model)

B. en_vectors_web_lg (only word vectors, no lemmatization)

C. en_core_web_lg (large model)

D. en_core_web_md (medium model)

❓ Metrics

advanced

Which metric best evaluates lemmatization quality?

You have a dataset with gold-standard lemmas and your spaCy model's predicted lemmas. Which metric best measures lemmatization accuracy?

A. Exact match accuracy

B. Recall

C. Precision

D. F1 score
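
For reference, the first metric in the list, exact match accuracy, is simply the share of tokens whose predicted lemma equals the gold lemma. The following is a minimal sketch in plain Python; the function name `exact_match_accuracy` and the two example lists are made up for illustration and are not taken from any real dataset or from spaCy's API.

```python
from typing import Sequence


def exact_match_accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of positions where the predicted lemma equals the gold lemma."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must be aligned token for token")
    if not gold:
        return 0.0
    matches = sum(g == p for g, p in zip(gold, predicted))
    return matches / len(gold)


# Illustrative data only (an assumption, not real model output).
gold = ["the", "bat", "be", "hang", "on", "their", "foot"]
predicted = ["the", "bat", "be", "hang", "on", "their", "feet"]

accuracy = exact_match_accuracy(gold, predicted)
print(f"{accuracy:.3f}")
```

The sketch assumes both lists share the same tokenization, so position i in one list refers to the same token as position i in the other.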

🔧 Debug

advanced

Why does this spaCy lemmatization code print integers instead of lemmas?

Consider this code snippet:

import spacy
nlp = spacy.load('en_core_web_sm')
text = 'Cats running fast'
doc = nlp(text)
lemmas = [token.lemma for token in doc]
print(lemmas)

It runs without raising an exception, but the printed list contains integers rather than lemma strings. Why?

A. The 'doc' object is not iterable

B. The 'nlp' object is not callable because of missing parentheses

C. The 'text' variable is not defined before use

D. 'token.lemma' is the integer hash ID of the lemma; the string form is 'token.lemma_'

🧠 Conceptual

expert

Why might spaCy lemmatization keep 'feet' as 'feet' instead of 'foot'?

In spaCy, the word 'feet' is lemmatized as 'feet' instead of the expected singular 'foot'. What is the most likely reason?

A. spaCy's lemmatizer uses a dictionary-based approach that sometimes leaves irregular plurals unchanged

B. The lemmatizer relies on part-of-speech tags: 'feet' is tagged as a plural noun, but the lemmatizer lacks a rule for this irregular plural

C. The model's vocabulary does not include 'foot', so it cannot lemmatize 'feet' correctly

D. spaCy treats 'feet' as a plural noun but does not normalize irregular plurals to the singular

Practice

1. What does lemmatization do in natural language processing using spaCy?

easy

A. It removes all punctuation from the text.

B. It counts the number of words in a sentence.

C. It finds the base or dictionary form of a word.

D. It translates text into another language.

5. You want to lemmatize a list of sentences and count how many times the lemma 'run' appears using spaCy. Which code snippet correctly does this?

hard

A.
import spacy
nlp = spacy.load('en_core_web_sm')
sentences = ['I run daily', 'He is running fast']
count = 0
for sent in sentences:
    doc = nlp(sent)
    count += sum(token.lemma_ == 'run' for token in doc)
print(count)

B.
import spacy
nlp = spacy.load('en_core_web_sm')
sentences = ['I run daily', 'He is running fast']
count = 0
for sent in sentences:
    doc = nlp(sent)
    count += sum(token.text == 'run' for token in doc)
print(count)

C.
import spacy
nlp = spacy.load('en_core_web_sm')
sentences = ['I run daily', 'He is running fast']
count = 0
for sent in sentences:
    doc = nlp(sent)
    count += sum(token.lemma == 'run' for token in doc)
print(count)

D.
import spacy
nlp = spacy.load('en_core_web_sm')
sentences = ['I run daily', 'He is running fast']
count = 0
for sent in sentences:
    doc = nlp(sent)
    count += sum(token.lemma_ == 'running' for token in doc)
print(count)
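
The counting pattern that these options vary on can also be sketched without loading a model. The version below is a rough illustration only: `TOY_LEMMAS`, `toy_lemmatize`, and `count_lemma` are hypothetical names, and the tiny lookup table is a stand-in for a real lemmatizer, not a description of how spaCy works internally.

```python
from collections import Counter
from typing import Callable, Iterable, List

# Hypothetical stand-in for a real lemmatizer: a tiny lookup table so the
# sketch runs without downloading a model.
TOY_LEMMAS = {"running": "run", "ran": "run", "runs": "run", "is": "be"}


def toy_lemmatize(sentence: str) -> List[str]:
    """Lowercase, split on whitespace, and map each word through the table."""
    return [TOY_LEMMAS.get(word, word) for word in sentence.lower().split()]


def count_lemma(
    sentences: Iterable[str],
    target: str,
    lemmatize: Callable[[str], List[str]],
) -> int:
    """Count how often `target` appears among the lemmas of all sentences."""
    counts: Counter = Counter()
    for sentence in sentences:
        counts.update(lemmatize(sentence))
    return counts[target]


sentences = ["I run daily", "He is running fast"]
total = count_lemma(sentences, "run", toy_lemmatize)
print(total)
```

With spaCy installed and a pipeline loaded as `nlp`, the same function could be called with `lambda s: [t.lemma_ for t in nlp(s)]` as the `lemmatize` argument; the result then depends on the lemmas that pipeline assigns.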

Lemmatization in spaCy in NLP - Practice Problems & Coding Challenges

Solution

Step 1: Understand the purpose of lemmatization

Step 2: Compare options to definition

Solution

Step 1: Recall spaCy token attribute for lemma

Step 2: Check each option

Solution

Step 1: Understand spaCy lemmatization output

Step 2: Match the list of lemmas

Solution

Step 1: Check spaCy lemma attribute usage

Step 2: Identify the error in code

Solution

Step 1: Understand the goal and spaCy usage

Step 2: Analyze each option
