Recall & Review

beginner

What is lemmatization in natural language processing?

Lemmatization is the process of converting a word to its base or dictionary form, called a lemma. For example, 'running' becomes 'run'. Grouping inflected forms such as 'runs', 'ran', and 'running' under one lemma lets a program treat them as the same word.
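To make the grouping idea concrete, here is a minimal pure-Python sketch. The TOY_LEMMAS table and the group_by_lemma helper are hypothetical names invented for this illustration; a real lemmatizer relies on far larger linguistic resources.

```python
from collections import defaultdict

# Hypothetical, hand-written lemma table used only for illustration.
TOY_LEMMAS = {
    "running": "run",
    "ran": "run",
    "runs": "run",
    "cats": "cat",
}


def group_by_lemma(words):
    """Group surface forms under their lemma; unknown words map to themselves."""
    groups = defaultdict(list)
    for word in words:
        lemma = TOY_LEMMAS.get(word.lower(), word.lower())
        groups[lemma].append(word)
    return dict(groups)


groups = group_by_lemma(["running", "ran", "runs", "cats", "cat"])
print(groups)
# {'run': ['running', 'ran', 'runs'], 'cat': ['cats', 'cat']}
```

Three different spellings end up under the single key 'run', which is the effect lemmatization is meant to achieve.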

intermediate

How does spaCy perform lemmatization?

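spaCy's trained pipelines include a lemmatizer component. For English it is rule-based: it combines each token's part-of-speech tag, assigned earlier in the pipeline, with suffix rules and exception tables to choose the lemma. Pipelines for other languages may use lookup tables or a trainable lemmatizer instead. Because the part of speech is taken into account, the same spelling can receive different lemmas in different contexts.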
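The role of the part-of-speech tag can be sketched with a toy lemmatizer. This is not spaCy's implementation: the EXCEPTIONS and SUFFIX_RULES tables and the toy_lemmatize function are hypothetical, and the rules are deliberately tiny, so they will mishandle many words (for example, doubled consonants as in 'running').

```python
# Toy sketch only: hypothetical tables, far smaller than any real lemmatizer's data.
EXCEPTIONS = {
    ("VERB", "ran"): "run",
    ("AUX", "are"): "be",
    ("NOUN", "mice"): "mouse",
}

# Suffix rules per part of speech: (suffix to strip, replacement).
SUFFIX_RULES = {
    "VERB": [("ing", ""), ("ed", ""), ("s", "")],
    "NOUN": [("ies", "y"), ("s", "")],
}


def toy_lemmatize(word, pos):
    """Return a lemma for `word`, given a coarse part-of-speech tag."""
    word = word.lower()
    # Irregular forms are handled first, through the exception table.
    if (pos, word) in EXCEPTIONS:
        return EXCEPTIONS[(pos, word)]
    # Otherwise apply the first matching suffix rule for this part of speech.
    for suffix, replacement in SUFFIX_RULES.get(pos, []):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)] + replacement
    return word


print(toy_lemmatize("meeting", "VERB"))  # meet
print(toy_lemmatize("meeting", "NOUN"))  # meeting
print(toy_lemmatize("are", "AUX"))       # be
```

The same spelling, 'meeting', gets a different lemma depending on the tag, which is why a lemmatizer of this kind needs part-of-speech information before it runs.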

beginner

Which spaCy attribute gives the lemma of a token?

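The attribute is token.lemma_ (with a trailing underscore). It returns the lemma as a string for each token in the processed text. The variant without the underscore, token.lemma, returns the integer hash ID of the same lemma, so comparing it to a string never matches.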
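The difference between the string and integer forms can be checked with a small sketch. It assumes spaCy v3 is installed and uses spacy.blank("en"), which only tokenizes and has no lemmatizer, so the lemmas are assigned by hand here purely for demonstration.

```python
import spacy

# A blank English pipeline needs no model download; it only tokenizes.
nlp = spacy.blank("en")
doc = nlp("cats running")

# Assumption for this sketch: lemmas are set manually, because a blank
# pipeline has no lemmatizer component to fill them in.
doc[0].lemma_ = "cat"
doc[1].lemma_ = "run"

for token in doc:
    # token.lemma_ is the readable string; token.lemma is its integer hash ID.
    print(token.text, token.lemma_, token.lemma)

# The vocabulary's string store maps the hash ID back to the string.
lemma_id = doc[1].lemma
lemma_text = nlp.vocab.strings[lemma_id]
print(lemma_text)  # run
```

In a trained pipeline such as en_core_web_sm the lemmatizer fills these attributes automatically; the lookup between token.lemma and token.lemma_ works the same way.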

intermediate

Why is lemmatization better than simple stemming?

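Lemmatization returns real dictionary words as base forms and takes context and part of speech into account. Stemming only strips word endings by rule and may produce non-words; a stemmer can turn 'studies' into 'studi', where a lemmatizer returns 'study'. Lemmatization therefore tends to give more accurate and meaningful results, usually at a higher processing cost.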
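The contrast can be seen in a small pure-Python sketch. Both helpers are hypothetical: naive_stem is a crude suffix stripper (not the Porter algorithm or any library's stemmer), and toy_lemma uses a tiny hand-written table standing in for a real lemmatizer.

```python
def naive_stem(word):
    """Crude suffix stripper for illustration; NOT a real stemming algorithm."""
    for suffix in ("ing", "es", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical lemma table standing in for a real lemmatizer.
TOY_LEMMAS = {"studies": "study", "running": "run", "was": "be"}


def toy_lemma(word):
    """Look the word up in the toy table; unknown words are returned unchanged."""
    return TOY_LEMMAS.get(word, word)


for w in ["studies", "running", "was"]:
    print(w, naive_stem(w), toy_lemma(w))
# studies studi study
# running runn run
# was was be
```

The stemmer's outputs 'studi' and 'runn' are not dictionary words, and it cannot relate 'was' to 'be' at all, while the lemma column contains valid base forms.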

beginner

Show a simple Python code snippet using spaCy to lemmatize the sentence: 'The cats are running quickly.'

import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp('The cats are running quickly.')
lemmas = [token.lemma_ for token in doc]
print(lemmas)

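With a spaCy v3 English pipeline, this typically prints: ['the', 'cat', 'be', 'run', 'quickly', '.']. Exact lemmas can vary slightly between spaCy and model versions.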

What does the spaCy attribute token.lemma_ return?

A. The word's frequency in the text

B. The part of speech tag

C. The original word text

D. The base form of the word

Which of these is a benefit of lemmatization over stemming?

A. Removes stop words automatically

B. Runs faster than stemming

C. Produces real dictionary words

D. Ignores word context

In spaCy, what must you do before accessing token.lemma_?

A. Load a language model and process text with nlp()

B. Manually define lemmas for each word

C. Call a separate lemmatization function

D. Nothing, it works on raw text

What is the lemma of the word 'running' in spaCy's default English model?

A. ran

B. run

C. running

D. runner

Which spaCy model is commonly used for English lemmatization?

A. en_core_web_sm

B. fr_core_news_sm

C. de_core_news_sm

D. xx_ent_wiki_sm

Explain what lemmatization is and how spaCy helps perform it.

Write a short Python code example using spaCy to lemmatize a sentence and print the lemmas.

Practice

1. What does lemmatization do in natural language processing using spaCy?

easy

A. It removes all punctuation from the text.

B. It counts the number of words in a sentence.

C. It finds the base or dictionary form of a word.

D. It translates text into another language.

5. You want to lemmatize a list of sentences and count how many times the lemma 'run' appears using spaCy. Which code snippet correctly does this?

hard

A.
import spacy
nlp = spacy.load('en_core_web_sm')
sentences = ['I run daily', 'He is running fast']
count = 0
for sent in sentences:
    doc = nlp(sent)
    count += sum(token.lemma_ == 'run' for token in doc)
print(count)

B.
import spacy
nlp = spacy.load('en_core_web_sm')
sentences = ['I run daily', 'He is running fast']
count = 0
for sent in sentences:
    doc = nlp(sent)
    count += sum(token.text == 'run' for token in doc)
print(count)

C.
import spacy
nlp = spacy.load('en_core_web_sm')
sentences = ['I run daily', 'He is running fast']
count = 0
for sent in sentences:
    doc = nlp(sent)
    count += sum(token.lemma == 'run' for token in doc)
print(count)

D.
import spacy
nlp = spacy.load('en_core_web_sm')
sentences = ['I run daily', 'He is running fast']
count = 0
for sent in sentences:
    doc = nlp(sent)
    count += sum(token.lemma_ == 'running' for token in doc)
print(count)
