What is Lemmatization in spaCy in NLP?

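Lemmatization reduces each word to its base (dictionary) form, called its lemma. It makes text easier to analyze because different inflected forms of a word, such as "run" and "running", are treated as one. In spaCy, the lemma of each token is available as the string attribute token.lemma_.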
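As a minimal sketch of the idea above, the following collects each token's text next to its lemma. The helper name lemmatize and the sample sentence are illustrative assumptions, and main() assumes the en_core_web_sm model is installed.

```python
def lemmatize(text, nlp):
    """Return (token text, lemma) pairs for each token in `text`.

    `nlp` is any loaded spaCy pipeline that includes a lemmatizer.
    """
    return [(token.text, token.lemma_) for token in nlp(text)]


def main():
    # Imported here so the helper above can be reused without loading spaCy.
    import spacy

    # Assumption: the model was installed beforehand with
    #   python -m spacy download en_core_web_sm
    nlp = spacy.load("en_core_web_sm")
    for text, lemma in lemmatize("The children were playing outside", nlp):
        print(f"{text:<10} -> {lemma}")
```

Calling main() prints one "text -> lemma" line per token. The exact lemmas depend on the model version.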

Lemmatization in spaCy in NLP - Syntax, Examples & Explanation

2. Which of the following is the correct way to get the lemma of a token in spaCy?

easy

A. token.lemma_

B. token.lemma

C. token.lemmatize()

D. token.get_lemma()

3. Given the code snippet:

import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp('The cats are running fast')
lemmas = [token.lemma_ for token in doc]

What is the value of lemmas?

medium

A. ['the', 'cats', 'are', 'running', 'fast']

B. ['The', 'cats', 'are', 'running', 'fast']

C. ['The', 'cat', 'is', 'run', 'fast']

D. ['the', 'cat', 'be', 'run', 'fast']

4. Identify the error in this spaCy lemmatization code:

import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp('She was eating apples')
lemmas = [token.lemma for token in doc]
print(lemmas)

medium

A. Missing parentheses in spacy.load()

B. Using token.lemma instead of token.lemma_

C. Incorrect model name in spacy.load()

D. Missing import for lemmatizer

5. You want to lemmatize a list of sentences and count how many times the lemma 'run' appears using spaCy. Which code snippet correctly does this?

hard

A.
import spacy
nlp = spacy.load('en_core_web_sm')
sentences = ['I run daily', 'He is running fast']
count = 0
for sent in sentences:
    doc = nlp(sent)
    count += sum(token.lemma_ == 'run' for token in doc)
print(count)

B.
import spacy
nlp = spacy.load('en_core_web_sm')
sentences = ['I run daily', 'He is running fast']
count = 0
for sent in sentences:
    doc = nlp(sent)
    count += sum(token.text == 'run' for token in doc)
print(count)

C.
import spacy
nlp = spacy.load('en_core_web_sm')
sentences = ['I run daily', 'He is running fast']
count = 0
for sent in sentences:
    doc = nlp(sent)
    count += sum(token.lemma == 'run' for token in doc)
print(count)

D.
import spacy
nlp = spacy.load('en_core_web_sm')
sentences = ['I run daily', 'He is running fast']
count = 0
for sent in sentences:
    doc = nlp(sent)
    count += sum(token.lemma_ == 'running' for token in doc)
print(count)


Solution

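Step 1: Understand the purpose of lemmatization

Lemmatization reduces a word to its base (dictionary) form, so that different inflected forms of the same word are treated as one.

Step 2: Compare options to definition

Pick the option that describes finding the base form of words. Options that describe other tasks do not match the definition.

Final Answer: The option stating that lemmatization finds the base form of words.

Quick Check: Lemmatization = base form of a word.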

Solution

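Step 1: Recall spaCy token attribute for lemma

In spaCy, attributes that end with an underscore return strings. token.lemma_ returns the lemma as text.

Step 2: Check each option

token.lemma (no underscore) returns the integer hash ID of the lemma, not the text. token.lemmatize() and token.get_lemma() are not methods of a spaCy token.

Final Answer: A. token.lemma_

Quick Check: Lemma as a string = token.lemma_ (with trailing underscore).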

Solution

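Step 1: Understand spaCy lemmatization output

For each token, token.lemma_ holds the base form assigned by the pipeline: 'cats' becomes 'cat', 'are' becomes 'be', and 'running' becomes 'run'. 'fast' is already a base form, and 'The' is lowercased to 'the'.

Step 2: Match the list of lemmas

Options A and B leave the words unlemmatized. Option C keeps 'The' capitalized and uses 'is' rather than the base form 'be'. Option D matches. Exact lemmas can vary slightly between model versions.

Final Answer: D. ['the', 'cat', 'be', 'run', 'fast']

Quick Check: Plural nouns and inflected verbs are reduced to their base forms.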

Solution

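Step 1: Check spaCy lemma attribute usage

token.lemma_ returns the lemma as a string. token.lemma (no underscore) returns the integer hash ID of the lemma.

Step 2: Identify the error in code

The list comprehension uses token.lemma, so the code runs without raising an exception but prints a list of integers instead of words. The spacy.load() call and the model name are valid, and no separate lemmatizer import is needed because the loaded pipeline already includes one.

Final Answer: B. Using token.lemma instead of token.lemma_

Quick Check: Integers in the output mean the trailing underscore is missing.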
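The difference between token.lemma and token.lemma_ in question 4 can be sketched as follows. The helper names get_lemmas and get_lemma_ids are illustrative assumptions, not spaCy functions, and main() assumes the en_core_web_sm model is installed.

```python
def get_lemmas(text, nlp):
    """Corrected form of the snippet: collect lemma strings."""
    return [token.lemma_ for token in nlp(text)]


def get_lemma_ids(text, nlp):
    """What the original snippet collects: integer hash IDs."""
    return [token.lemma for token in nlp(text)]


def main():
    # Imported here so the helpers above can be reused without loading spaCy.
    import spacy

    nlp = spacy.load("en_core_web_sm")
    text = "She was eating apples"
    print(get_lemmas(text, nlp))     # list of str
    print(get_lemma_ids(text, nlp))  # list of int

    # An ID can be mapped back to its string through the shared vocabulary.
    first = nlp(text)[0]
    print(nlp.vocab.strings[first.lemma] == first.lemma_)
```

Because both attributes exist on every token, the mistake does not raise an error. It only shows up when the output is inspected.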

Solution

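Step 1: Understand the goal and spaCy usage

The goal is to count tokens whose lemma is 'run'. That requires comparing the string attribute token.lemma_ with 'run' for every token in every sentence.

Step 2: Analyze each option

Option A compares token.lemma_ with 'run', so it counts both 'run' and 'running'. Option B compares token.text, so it misses 'running'. Option C compares token.lemma, an integer ID, with a string, so it never matches. Option D compares with 'running', which is an inflected form and not the lemma.

Final Answer: A. The snippet that compares token.lemma_ with 'run'.

Quick Check: Compare token.lemma_ against the base form, not against the surface text.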
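The counting approach from question 5 can also be written with nlp.pipe and collections.Counter. This is a sketch under assumptions: count_lemma is a hypothetical helper name, and main() assumes the en_core_web_sm model is installed.

```python
from collections import Counter


def count_lemma(sentences, target, nlp):
    """Count tokens across `sentences` whose lemma equals `target`."""
    counts = Counter()
    # nlp.pipe processes the texts as a stream, which is usually more
    # efficient than calling nlp() once per sentence.
    for doc in nlp.pipe(sentences):
        counts.update(token.lemma_ for token in doc)
    return counts[target]


def main():
    # Imported here so the helper above can be reused without loading spaCy.
    import spacy

    nlp = spacy.load("en_core_web_sm")
    sentences = ["I run daily", "He is running fast"]
    print(count_lemma(sentences, "run", nlp))
```

Building a Counter of all lemmas costs little extra and makes it easy to look up other lemmas from the same pass.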