NER with spaCy in NLP

Introduction

Named entity recognition (NER) automatically finds and labels names of people, places, organizations, dates, and similar items in text. It turns free text into structured information that a program can work with.

Common uses:
Extracting names of people from news articles.
Finding locations mentioned in travel blogs.
Identifying dates and times in emails.
Pulling out company names from financial reports.
Highlighting product names in customer reviews.
Syntax
import spacy

# Load a pre-trained model
# (install it once with: python -m spacy download en_core_web_sm)
nlp = spacy.load('en_core_web_sm')

# Process text
text = "Apple is looking at buying U.K. startup for $1 billion"
doc = nlp(text)

# Extract entities
for ent in doc.ents:
    print(ent.text, ent.label_)

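Use spacy.load() to load a pre-trained pipeline that includes an NER component.

Calling nlp(text) returns a Doc. Its doc.ents attribute holds the entities that were found; each entity exposes text (the matched string) and label_ (the entity type as a string).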
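Each entity in doc.ents is a Span, so besides text and label_ it also carries character offsets into the original string. The sketch below hand-annotates a single entity on a blank pipeline. That is an assumption made purely so the result does not depend on a trained model's predictions; with a loaded model you would read the same attributes from the entities it finds.

```python
import spacy
from spacy.tokens import Span

# Blank English pipeline: tokenizer only, no trained NER component
nlp = spacy.blank("en")
text = "Apple is looking at buying U.K. startup for $1 billion"
doc = nlp(text)

# Hand-annotate token 0 ("Apple") as ORG, purely for illustration
doc.ents = [Span(doc, 0, 1, label="ORG")]

for ent in doc.ents:
    # start_char and end_char are character offsets into the original text
    print(ent.text, ent.label_, ent.start_char, ent.end_char)
```

The offsets are handy when you need to highlight or replace the entity in the source text, since text[ent.start_char:ent.end_char] gives back ent.text.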

Examples
Extracts person and location names from a simple sentence.
doc = nlp("Barack Obama was born in Hawaii.")
for ent in doc.ents:
    print(ent.text, ent.label_)
Shows how to get all entities as a list of tuples.
doc = nlp("Amazon plans to open a new office in Seattle in 2024.")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
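Once entities are in a list of (text, label) tuples, a common next step is to group them by type. The following is a minimal sketch in plain Python; the group_by_label helper is a hypothetical name, and the sample tuples are hand-written for illustration rather than actual model output.

```python
from collections import defaultdict

# Illustrative (text, label) pairs in the same shape as the list built
# from doc.ents; these values are hand-written, not model output
entities = [
    ("Amazon", "ORG"),
    ("Seattle", "GPE"),
    ("2024", "DATE"),
    ("Microsoft", "ORG"),
]

def group_by_label(pairs):
    """Map each entity label to the list of entity texts carrying it."""
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[label].append(text)
    return dict(grouped)

by_label = group_by_label(entities)
print(by_label)
```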
Sample Program

This program finds names of people, organizations, and places in the text.

import spacy

# Load English model with NER
# (requires: python -m spacy download en_core_web_sm)
nlp = spacy.load('en_core_web_sm')

# Sample text
text = "Google was founded by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University."

# Process text
doc = nlp(text)

# Print entities found
for ent in doc.ents:
    print(f"Entity: {ent.text}, Type: {ent.label_}")
Important Notes

spaCy's pre-trained English models recognize common entity types such as PERSON, ORG (organizations), GPE (geopolitical entities: countries, cities, states), DATE, and MONEY. The exact label set depends on the model you load.
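If a label's meaning is unclear, spaCy can describe it. The short sketch below uses spacy.explain(), which looks a label up in spaCy's built-in glossary; it needs no downloaded model.

```python
import spacy

labels = ["PERSON", "ORG", "GPE", "DATE", "MONEY"]

# spacy.explain() returns a short description for a label in spaCy's
# glossary, or None if the label is unknown
descriptions = {label: spacy.explain(label) for label in labels}

for label, description in descriptions.items():
    print(label, "->", description)
```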

NER works best on well-formed text; slang or typos may reduce accuracy.

You can train spaCy on your own data to recognize custom entities if needed.
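Training a statistical model needs annotated data and is more than a short example can show. As a lighter-weight alternative (a swapped-in, rule-based technique, not training), spaCy's entity_ruler component can tag custom terms from patterns. This sketch assumes spaCy v3; the PRODUCT pattern and the sample sentence are made up for illustration.

```python
import spacy

# Blank English pipeline: tokenizer only, so every entity below comes
# from the rules we add, not from a trained model
nlp = spacy.blank("en")

# Add the rule-based entity ruler and give it one phrase pattern
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "PRODUCT", "pattern": "Pixel 8"},  # illustrative pattern
])

doc = nlp("I bought a Pixel 8 yesterday.")
custom_entities = [(ent.text, ent.label_) for ent in doc.ents]
print(custom_entities)
```

Rules like this only match the exact phrases you list, so they suit small, fixed vocabularies; a trained model is the better fit when entities vary in wording.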

Summary

NER automatically finds and labels names, places, dates, and similar terms in text.

spaCy makes NER easy: load a pre-trained model, process the text, and read doc.ents.

The extracted entities give other programs structured data to work with instead of raw text.