Named Entity Recognition (NER) automatically finds and labels names of people, places, organizations, and similar items in text. It turns free text into structured pieces that a program can search, count, or filter.
NER with spaCy in NLP
Introduction
Syntax
import spacy

# Load a pre-trained model
nlp = spacy.load('en_core_web_sm')

# Process text
text = "Apple is looking at buying U.K. startup for $1 billion"
doc = nlp(text)

# Extract entities
for ent in doc.ents:
    print(ent.text, ent.label_)
Use spacy.load() to load a language model with NER included.
Entities are accessed with doc.ents, each having text and label_.
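The label_ values are short codes such as ORG or GPE. As a small sketch, assuming spaCy v3 is installed (no model download is needed for this part), spacy.explain() can turn a code into a readable description:

```python
import spacy

# Entity label codes that the pre-trained English models commonly use.
labels = ["PERSON", "ORG", "GPE", "DATE", "MONEY"]

# spacy.explain() looks each code up in spaCy's built-in glossary.
descriptions = {label: spacy.explain(label) for label in labels}

for label, description in descriptions.items():
    print(f"{label}: {description}")
```

This is handy when a model returns a label you have not seen before.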
Examples
doc = nlp("Barack Obama was born in Hawaii.")
for ent in doc.ents:
    print(ent.text, ent.label_)
doc = nlp("Amazon plans to open a new office in Seattle in 2024.")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
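Once entities are collected as (text, label) pairs, it is often useful to group them by label. The sketch below uses plain Python only. The helper name group_by_label and the sample pairs are hypothetical, written by hand for illustration, and are not real model output:

```python
from collections import defaultdict


def group_by_label(entities):
    """Group (text, label) pairs into a {label: [text, ...]} dictionary."""
    grouped = defaultdict(list)
    for text, label in entities:
        grouped[label].append(text)
    return dict(grouped)


# Hand-written sample pairs for illustration only, not real model output.
sample_entities = [("Amazon", "ORG"), ("Seattle", "GPE"), ("2024", "DATE")]

grouped = group_by_label(sample_entities)
print(grouped)
```

With a real doc, you would pass in [(ent.text, ent.label_) for ent in doc.ents] instead of the sample list.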
Sample Model
This program finds names of people, organizations, and places in the text.
import spacy

# Load English model with NER
nlp = spacy.load('en_core_web_sm')

# Sample text
text = "Google was founded by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University."

# Process text
doc = nlp(text)

# Print entities found
for ent in doc.ents:
    print(f"Entity: {ent.text}, Type: {ent.label_}")
Important Notes
spaCy's pre-trained English models recognize common entity types such as PERSON, ORG (organizations), GPE (countries, cities, states), DATE, and MONEY.
NER works best on well-formed text; slang or typos may reduce accuracy.
You can train spaCy on your own data to recognize custom entities if needed.
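Training a model is one route to custom entities. A lighter alternative, shown here as a sketch and not as a replacement for training, is spaCy's rule-based entity_ruler component. The example assumes spaCy v3. The company name, product name, and patterns are made up for illustration:

```python
import spacy

# A blank English pipeline: tokenizer only, no statistical NER.
nlp = spacy.blank("en")

# Add spaCy's built-in rule-based entity matcher (spaCy v3 API).
ruler = nlp.add_pipe("entity_ruler")

# Hypothetical patterns: the names and label choices are illustrative.
ruler.add_patterns([
    # A string pattern matches the exact phrase.
    {"label": "ORG", "pattern": "Acme Robotics"},
    # A token pattern matches token by token, here case-insensitively.
    {"label": "PRODUCT", "pattern": [{"LOWER": "widget"}, {"LOWER": "pro"}]},
])

doc = nlp("Acme Robotics released the Widget Pro in Berlin.")
custom_entities = [(ent.text, ent.label_) for ent in doc.ents]
print(custom_entities)
```

Berlin is not tagged here because the blank pipeline has no statistical NER. Adding the ruler to a loaded model, for example with nlp.add_pipe("entity_ruler", before="ner"), lets rules and the trained model work together.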
Summary
NER finds important names and terms in text automatically.
spaCy makes NER easy with pre-trained models and simple code.
Extracted entities help computers understand text better for many applications.
Practice
1. What does NER (Named Entity Recognition) do in natural language processing?
easy
Solution
Step 1: Understand NER's purpose
NER identifies specific names like people, places, or organizations in text.
Step 2: Compare with other NLP tasks
Translation, summarization, and text generation are different tasks from NER.
Final Answer:
It finds and labels important names and terms in text automatically. -> Option D
Quick Check:
NER = Finds names and terms [OK]
Hint: NER extracts names and terms, not translations or summaries [OK]
Common Mistakes:
- Confusing NER with translation or summarization
- Thinking NER generates new text
- Believing NER only finds keywords, not named entities
2. Which of the following is the correct way to load a pre-trained spaCy model for NER?
easy
Solution
Step 1: Recall spaCy model loading syntax
spaCy uses spacy.load('model_name') to load pre-trained models.
Step 2: Check each option
Only import spacy; nlp = spacy.load('en_core_web_sm') uses spacy.load correctly; others use invalid functions.
Final Answer:
import spacy; nlp = spacy.load('en_core_web_sm') -> Option A
Quick Check:
spaCy model loading = spacy.load() [OK]
Hint: Use spacy.load('model_name') to load models [OK]
Common Mistakes:
- Using spacy.model or spacy.load_model which don't exist
- Trying spacy.get which is not a spaCy function
- Forgetting to import spacy before loading
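A related pitfall is calling spacy.load() for a model that has not been downloaded, which raises OSError. The sketch below shows one way to handle that. The helper name load_model_or_none and the fake model name are hypothetical, chosen only for this illustration:

```python
import spacy


def load_model_or_none(name="en_core_web_sm"):
    """Return the loaded pipeline, or None if the model is not installed."""
    try:
        return spacy.load(name)
    except OSError:
        # spacy.load raises OSError when it cannot find the named model.
        print(f"Model '{name}' not found. Try: python -m spacy download {name}")
        return None


# Deliberately fake model name to show the failure path.
missing = load_model_or_none("no_such_model_for_this_sketch")
print(missing)
```

Returning None keeps the example short. In a real script you may prefer to stop with a clear error message instead.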
3. Given this code snippet using spaCy for NER:
import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp('Apple is looking at buying U.K. startup for $1 billion')
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
What will be the output?
medium
Solution
Step 1: Understand spaCy NER labels
Apple is recognized as an organization (ORG), U.K. as a geopolitical entity (GPE), and $1 billion as money (MONEY).
Step 2: Match entities with labels
[('Apple', 'ORG'), ('U.K.', 'GPE'), ('$1 billion', 'MONEY')] correctly matches these entities and labels as spaCy outputs.
Final Answer:
[('Apple', 'ORG'), ('U.K.', 'GPE'), ('$1 billion', 'MONEY')] -> Option C
Quick Check:
spaCy NER output matches [('Apple', 'ORG'), ('U.K.', 'GPE'), ('$1 billion', 'MONEY')] [OK]
Hint: Check spaCy's common entity labels for correct matches [OK]
Common Mistakes:
- Confusing ORG with PERSON or GPE
- Mislabeling MONEY as QUANTITY
- Including words like 'startup' as entities
4. You run this code but get an error:
import spacy
doc = nlp('Google is a tech giant')
What is the most likely cause?
medium
Solution
Step 1: Check variable definitions
The code calls nlp without first defining it, for example with nlp = spacy.load('en_core_web_sm').
Step 2: Identify error cause
Python raises a NameError because the name 'nlp' is undefined.
Final Answer:
The variable 'nlp' is not defined before use. -> Option B
Quick Check:
Undefined variable 'nlp' causes error [OK]
Hint: Always load model with spacy.load before using nlp [OK]
Common Mistakes:
- Assuming text length causes error
- Thinking spaCy can't recognize common words
- Confusing print syntax errors with variable errors
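The error can be reproduced in plain Python, without spaCy installed. This minimal sketch catches the exception so the message can be inspected. The variable error_message is introduced here only for illustration:

```python
error_message = None

try:
    # `nlp` was never assigned, so Python cannot look the name up.
    doc = nlp("Google is a tech giant")
except NameError as err:
    error_message = str(err)

print(error_message)
```

The fix is to assign nlp before using it, for example with nlp = spacy.load('en_core_web_sm').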
5. You want to extract only person names from a text using spaCy's NER. Which code snippet correctly filters for persons?
hard
Solution
Step 1: Identify label for persons in spaCy
spaCy uses the 'PERSON' label for people's names.
Step 2: Filter entities by 'PERSON'
Filtering doc.ents by ent.label_ == 'PERSON' extracts only person names.
Final Answer:
persons = [ent.text for ent in doc.ents if ent.label_ == 'PERSON'] -> Option A
Quick Check:
Filter entities by 'PERSON' label [OK]
Hint: Filter entities with label_ == 'PERSON' to get names [OK]
Common Mistakes:
- Using wrong labels like ORG or GPE for persons
- Not filtering entities at all
- Confusing entity text with label
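The filter can be tried without downloading a model. In the sketch below the entities are set by hand on a blank pipeline so the result is reproducible. That hand-labeling is an assumption made for the demo; a pre-trained model would normally assign the labels. The helper name entity_texts is hypothetical:

```python
import spacy
from spacy.tokens import Span


def entity_texts(doc, label):
    """Return the text of every entity in doc that carries the given label."""
    return [ent.text for ent in doc.ents if ent.label_ == label]


# Blank pipeline: tokenizer only, so no model download is required.
nlp = spacy.blank("en")
doc = nlp("Larry Page founded Google")

# Hand-set entities for the demo: tokens 0-1 are a PERSON, token 3 is an ORG.
doc.ents = [Span(doc, 0, 2, label="PERSON"), Span(doc, 3, 4, label="ORG")]

persons = entity_texts(doc, "PERSON")
print(persons)
```

The same helper works for any other label, such as entity_texts(doc, "ORG").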
