How to Do Named Entity Recognition (NER) in Python Easily
You can perform Named Entity Recognition (NER) in Python with the spaCy library by loading a pre-trained model and calling nlp(text). The recognized entities are available through doc.ents, which contains the detected names, places, dates, and more.
Syntax
To perform NER with spaCy, you first load a language model with spacy.load(). Then you process your text with nlp(text) to get a Doc object. The named entities are in doc.ents, each having .text (the entity string) and .label_ (the entity type).
```python
import spacy

# Load the English model
nlp = spacy.load('en_core_web_sm')

# Process text
text = "Apple is looking at buying U.K. startup for $1 billion"
doc = nlp(text)

# Access entities
for ent in doc.ents:
    print(ent.text, ent.label_)
```
Output
Apple ORG
U.K. GPE
$1 billion MONEY
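Because each entity exposes .text and .label_, it is often convenient to collect them as plain tuples and group them by type. The following is a minimal sketch, assuming an already loaded pipeline is passed in as nlp; the helper names extract_entities and group_by_label are illustrative and not part of spaCy.

```python
from collections import defaultdict


def extract_entities(nlp, text):
    """Return (entity text, entity label) pairs for one piece of text.

    `nlp` is assumed to be a loaded spaCy pipeline, for example the
    result of spacy.load('en_core_web_sm').
    """
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]


def group_by_label(pairs):
    """Group (text, label) pairs into a dict mapping label -> list of texts."""
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[label].append(text)
    return dict(grouped)


# The pairs from the output above, written out by hand for illustration.
pairs = [("Apple", "ORG"), ("U.K.", "GPE"), ("$1 billion", "MONEY")]
print(group_by_label(pairs))
```

With a real pipeline, pairs would come from extract_entities(nlp, text) instead of being typed in by hand.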
Example
This example shows how to use spaCy to detect entities like organizations, locations, and money amounts in a sentence.
```python
import spacy

# Load the English model
nlp = spacy.load('en_core_web_sm')

# Sample text
text = "Barack Obama was born in Hawaii and was the 44th president of the United States."

# Process the text
doc = nlp(text)

# Print detected entities and their labels
for ent in doc.ents:
    print(f"Entity: {ent.text}, Type: {ent.label_}")
```
Output
Entity: Barack Obama, Type: PERSON
Entity: Hawaii, Type: GPE
Entity: 44th, Type: ORDINAL
Entity: United States, Type: GPE
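In practice you may only care about some entity types, such as places. The sketch below filters (text, label) pairs by a set of wanted labels; filter_entities is a hypothetical helper name, and the detected list simply repeats the output above rather than calling the model.

```python
def filter_entities(pairs, wanted_labels):
    """Keep only the (text, label) pairs whose label is in wanted_labels."""
    wanted = set(wanted_labels)
    return [(text, label) for text, label in pairs if label in wanted]


# With a real Doc you would build this list as:
#   detected = [(ent.text, ent.label_) for ent in doc.ents]
detected = [
    ("Barack Obama", "PERSON"),
    ("Hawaii", "GPE"),
    ("44th", "ORDINAL"),
    ("United States", "GPE"),
]

# Keep geopolitical entities and other locations only.
places = filter_entities(detected, {"GPE", "LOC"})
print(places)
```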
Common Pitfalls
- Not installing the language model before loading it (run python -m spacy download en_core_web_sm first).
- Confusing entity labels (like PERSON vs ORG).
- Using raw text without processing it through the nlp pipeline.
- Expecting perfect results: NER models can miss or mislabel entities.
```python
import spacy

# Wrong: forgetting to download model
# nlp = spacy.load('en_core_web_sm')  # This will error if model not installed

# Correct: install model first
# Run in terminal: python -m spacy download en_core_web_sm
nlp = spacy.load('en_core_web_sm')

text = "Google was founded by Larry Page and Sergey Brin."
doc = nlp(text)
for ent in doc.ents:
    print(ent.text, ent.label_)
```
Output
Google ORG
Larry Page PERSON
Sergey Brin PERSON
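One way to guard against the missing-model pitfall is to check for the model before loading it. The sketch below assumes the model was installed as a regular Python package (which is what python -m spacy download does), so it can be detected with the standard library; model_is_installed and load_model are hypothetical helper names, not spaCy functions.

```python
import importlib.util


def model_is_installed(model_name):
    """Return True if `model_name` can be found as an importable package."""
    return importlib.util.find_spec(model_name) is not None


def load_model(model_name="en_core_web_sm"):
    """Load a spaCy model, or fail with a message that says how to install it."""
    if not model_is_installed(model_name):
        raise RuntimeError(
            f"Model '{model_name}' is not installed. "
            f"Run: python -m spacy download {model_name}"
        )
    # Imported here so the check above can run even if spaCy itself is missing.
    import spacy

    return spacy.load(model_name)
```

Calling nlp = load_model() then either returns a ready pipeline or stops early with the install command in the error message.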
Quick Reference
Here are some common entity labels you will see in spaCy's NER output:
| Entity Label | Meaning |
|---|---|
| PERSON | People, including fictional |
| ORG | Organizations, companies, agencies |
| GPE | Countries, cities, states |
| LOC | Non-GPE locations, mountain ranges, bodies of water |
| DATE | Absolute or relative dates or periods |
| MONEY | Monetary values, including unit |
| TIME | Times smaller than a day |
| ORDINAL | “First”, “second”, etc. |
| CARDINAL | Numerals that do not fall under another type |
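
If you meet a label that is not in this table, spaCy's spacy.explain() function returns a short description for labels it knows. The sketch below wraps it in a hypothetical describe_label helper that falls back to a local copy of the table above when spaCy is not installed or does not recognize the label.

```python
# Meanings copied from the table above; used as a fallback.
LABEL_MEANINGS = {
    "PERSON": "People, including fictional",
    "ORG": "Organizations, companies, agencies",
    "GPE": "Countries, cities, states",
    "LOC": "Non-GPE locations, mountain ranges, bodies of water",
    "DATE": "Absolute or relative dates or periods",
    "MONEY": "Monetary values, including unit",
    "TIME": "Times smaller than a day",
    "ORDINAL": "\"First\", \"second\", etc.",
    "CARDINAL": "Numerals that do not fall under another type",
}


def describe_label(label):
    """Return a human-readable meaning for an entity label."""
    try:
        import spacy

        # spacy.explain returns None for terms it does not know.
        description = spacy.explain(label)
    except ImportError:
        description = None
    return description or LABEL_MEANINGS.get(label, "unknown label")


for label in ("ORG", "GPE", "NOT_A_REAL_LABEL"):
    print(label, "->", describe_label(label))
```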
Key Takeaways
- Use spaCy's pre-trained models to quickly perform NER in Python.
- Always process text with the nlp pipeline before accessing entities.
- Install the required language model before loading it to avoid errors.
- Entity labels help identify the type of named entity detected.
- NER models are not perfect; review results for your use case.
