Named entity recognition (NER) automatically finds the names of people, places, organizations, and other things in text. It gives computers structured information to work with, which makes text easier to process and understand.
NER with spaCy in NLP
Introduction
Extracting names of people from news articles.
Finding locations mentioned in travel blogs.
Identifying dates and times in emails.
Pulling out company names from financial reports.
Highlighting product names in customer reviews.
Syntax
import spacy

# Load a pre-trained model
nlp = spacy.load('en_core_web_sm')

# Process text
text = "Apple is looking at buying U.K. startup for $1 billion"
doc = nlp(text)

# Extract entities
for ent in doc.ents:
    print(ent.text, ent.label_)
Use spacy.load() to load a language model with NER included.
Access the entities through doc.ents; each one has a text attribute (the matched span of text) and a label_ attribute (its entity type as a string).
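Beyond text and label_, each entity also carries its character offsets into the original string, and spacy.explain() returns a short description of a label. The sketch below shows both, and also handles the case where the model package has not been installed yet. The helper name load_english_model is hypothetical and not part of spaCy.

```python
import spacy


def load_english_model(name="en_core_web_sm"):
    """Return the loaded pipeline, or None if the package is not installed.

    Hypothetical helper for illustration; it is not part of spaCy.
    """
    try:
        return spacy.load(name)
    except OSError:
        # spacy.load() raises OSError when it cannot find the model package.
        print(f"Model '{name}' not found. Install it with:")
        print(f"    python -m spacy download {name}")
        return None


nlp = load_english_model()
if nlp is not None:
    text = "Apple is looking at buying U.K. startup for $1 billion"
    doc = nlp(text)
    for ent in doc.ents:
        # start_char and end_char are offsets into the original string,
        # so text[ent.start_char:ent.end_char] equals ent.text.
        print(ent.text, ent.label_, ent.start_char, ent.end_char,
              spacy.explain(ent.label_))
```

The offsets are useful when entities need to be highlighted or replaced in the source text rather than just listed.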
Examples
Extracts person and location names from a simple sentence.
doc = nlp("Barack Obama was born in Hawaii.")
for ent in doc.ents:
    print(ent.text, ent.label_)
Collects all entities into a list of (text, label) tuples.
doc = nlp("Amazon plans to open a new office in Seattle in 2024.")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
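A list of (text, label) tuples is often easier to use once it is grouped by entity type. The following is a minimal sketch of that step in plain Python; group_by_label is a hypothetical helper name, and the sample pairs are hand-written for illustration rather than actual model output.

```python
from collections import defaultdict


def group_by_label(entities):
    """Group (text, label) tuples into a dict mapping label -> list of texts."""
    grouped = defaultdict(list)
    for text, label in entities:
        grouped[label].append(text)
    return dict(grouped)


# Hand-written sample pairs for illustration, not actual model output.
entities = [("Amazon", "ORG"), ("Seattle", "GPE"), ("2024", "DATE")]
print(group_by_label(entities))
# prints {'ORG': ['Amazon'], 'GPE': ['Seattle'], 'DATE': ['2024']}
```

The same function works on the entities list built from doc.ents, since it only relies on the tuple shape.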
Sample Program
This program finds names of people, organizations, and places in the text.
import spacy

# Load English model with NER
nlp = spacy.load('en_core_web_sm')

# Sample text
text = "Google was founded by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University."

# Process text
doc = nlp(text)

# Print entities found
for ent in doc.ents:
    print(f"Entity: {ent.text}, Type: {ent.label_}")
Important Notes
spaCy's pre-trained models recognize common entity types such as PERSON, ORG (organizations), GPE (geopolitical entities: countries, cities, states), DATE, and MONEY.
NER works best on well-formed text; slang or typos may reduce accuracy.
You can train spaCy on your own data to recognize custom entities if needed.
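Training a statistical model is one route to custom entities. For a small, fixed vocabulary, a lighter alternative is spaCy's rule-based entity_ruler component, sketched here on a blank English pipeline. Note that this swaps rule matching in for training, and that the GADGET label and the product names are made up for illustration.

```python
import spacy

# A blank pipeline has a tokenizer but no trained components,
# so no model download is needed for this sketch.
nlp = spacy.blank("en")

# The entity ruler marks entities from explicit patterns, not learned weights.
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    # Phrase pattern: matches this exact token sequence.
    {"label": "GADGET", "pattern": "Pixel 8"},
    # Token pattern: matches case-insensitively, one dict per token.
    {"label": "GADGET", "pattern": [{"LOWER": "thinkpad"}, {"LOWER": "x1"}]},
])

doc = nlp("I compared the Pixel 8 with a thinkpad x1 last week.")
custom_entities = [(ent.text, ent.label_) for ent in doc.ents]
print(custom_entities)
```

With a pre-trained pipeline, the ruler can be added with nlp.add_pipe("entity_ruler", before="ner") so that its matches are set before the statistical component runs.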
Summary
NER finds important names and terms in text automatically.
spaCy makes NER easy with pre-trained models and simple code.
Extracted entities help computers understand text better for many applications.