How to Do Named Entity Recognition (NER) in Python Easily

To perform Named Entity Recognition (NER) in Python, load a pre-trained spaCy model and call nlp(text) on your text. The recognized entities (names, places, dates, and more) are then available in doc.ents.

Syntax

To perform NER with spaCy, first load a language model with spacy.load(), then pass your text to nlp(text) to get a Doc object. The named entities are in doc.ents; each one has .text (the entity string) and .label_ (the entity type).

python
import spacy

# Load the English model
nlp = spacy.load('en_core_web_sm')

# Process text
text = "Apple is looking at buying U.K. startup for $1 billion"
doc = nlp(text)

# Access entities
for ent in doc.ents:
    print(ent.text, ent.label_)
Output
Apple ORG
U.K. GPE
$1 billion MONEY
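
Each entity in doc.ents is a spaCy Span, so besides .text and .label_ it also carries its character offsets in the original string (.start_char and .end_char). The sketch below collects these into plain dictionaries, which is often handy for saving or inspecting results. The helper name entity_records is hypothetical and not part of spaCy.

```python
from typing import Any, Dict, List


def entity_records(doc: Any) -> List[Dict[str, Any]]:
    """Turn the entities of a processed document into plain dictionaries.

    `doc` is expected to be a spaCy Doc; any object with an `.ents` iterable
    whose items have .text, .label_, .start_char and .end_char also works.
    """
    return [
        {
            "text": ent.text,
            "label": ent.label_,
            "start": ent.start_char,  # offset of the entity's first character
            "end": ent.end_char,      # offset just past its last character
        }
        for ent in doc.ents
    ]


# Usage with the pipeline loaded above (not executed here):
#   doc = nlp("Apple is looking at buying U.K. startup for $1 billion")
#   for record in entity_records(doc):
#       print(record)
```

With these offsets, text[record["start"]:record["end"]] gives back the entity string, which is a quick way to check where each entity sits in the input.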

Example

This example shows how to use spaCy to detect entities like organizations, locations, and money amounts in a sentence.

python
import spacy

# Load the English model
nlp = spacy.load('en_core_web_sm')

# Sample text
text = "Barack Obama was born in Hawaii and was the 44th president of the United States."

# Process the text
doc = nlp(text)

# Print detected entities and their labels
for ent in doc.ents:
    print(f"Entity: {ent.text}, Type: {ent.label_}")
Output
Entity: Barack Obama, Type: PERSON
Entity: Hawaii, Type: GPE
Entity: 44th, Type: ORDINAL
Entity: United States, Type: GPE
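
Once the entities are detected, you will often want only some types, for example all the places. One possible approach is to group the entity strings by label, as sketched below. The function name entities_by_label is an assumption made for illustration; spaCy does not provide it.

```python
from collections import defaultdict
from typing import Any, Dict, List


def entities_by_label(doc: Any) -> Dict[str, List[str]]:
    """Group entity texts under their label, keeping document order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for ent in doc.ents:
        grouped[ent.label_].append(ent.text)
    return dict(grouped)  # plain dict, so missing labels raise KeyError


# Usage with the doc from the example above (not executed here):
#   places = entities_by_label(doc).get("GPE", [])
#   print(places)
```

Using .get() with a default avoids an error when the text contains no entity of the requested type.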

Common Pitfalls

  • Not installing the language model before loading it. Run python -m spacy download en_core_web_sm first.
  • Confusing entity labels (like PERSON vs ORG).
  • Trying to read entities from a raw string. Only the Doc returned by nlp(text) has .ents.
  • Expecting perfect results: NER models can miss or mislabel entities.
python
import spacy

# spacy.load() raises an OSError if the model is not installed.
# Install it once from a terminal before running this script:
#   python -m spacy download en_core_web_sm

nlp = spacy.load('en_core_web_sm')
text = "Google was founded by Larry Page and Sergey Brin."
doc = nlp(text)
for ent in doc.ents:
    print(ent.text, ent.label_)
Output
Google ORG
Larry Page PERSON
Sergey Brin PERSON
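
To guard against the first pitfall in a script, you can check for the model before loading it and stop with a helpful message if it is missing. This is a minimal sketch, assuming the model was installed as a regular Python package (which is how the download command installs it). The names model_installed and load_model are hypothetical helpers, not spaCy functions.

```python
import importlib.util


def model_installed(name: str) -> bool:
    """Return True if a Python package called `name` can be found."""
    return importlib.util.find_spec(name) is not None


def load_model(name: str = "en_core_web_sm"):
    """Load a spaCy model, or stop with an install hint if it is missing."""
    if not model_installed(name):
        raise SystemExit(
            f"Model '{name}' is not installed. "
            f"Run: python -m spacy download {name}"
        )
    import spacy  # imported late so the check above works without spaCy

    return spacy.load(name)


# Usage (not executed here):
#   nlp = load_model()
#   doc = nlp("Google was founded by Larry Page and Sergey Brin.")
```

Failing early with the install command in the message is usually easier to act on than the raw loading error.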

Quick Reference

Here are some common entity labels you will see in spaCy's NER output:

Entity Label    Meaning
PERSON          People, including fictional
ORG             Organizations, companies, agencies
GPE             Countries, cities, states
LOC             Non-GPE locations, mountain ranges, bodies of water
DATE            Absolute or relative dates or periods
MONEY           Monetary values, including unit
TIME            Times smaller than a day
ORDINAL         “First”, “second”, etc.
CARDINAL        Numerals that do not fall under another type
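
If you want these descriptions next to your results, one simple option is a lookup dictionary built from the table above, as in the sketch below. The names LABEL_MEANINGS and describe_label are made up for this example, and the dictionary covers only the labels listed here, not every label a model can produce. spaCy also has a built-in spacy.explain(label) function that returns a description for a label.

```python
# Meanings copied from the quick reference table; not spaCy's full label set.
LABEL_MEANINGS = {
    "PERSON": "People, including fictional",
    "ORG": "Organizations, companies, agencies",
    "GPE": "Countries, cities, states",
    "LOC": "Non-GPE locations, mountain ranges, bodies of water",
    "DATE": "Absolute or relative dates or periods",
    "MONEY": "Monetary values, including unit",
    "TIME": "Times smaller than a day",
    "ORDINAL": '"First", "second", etc.',
    "CARDINAL": "Numerals that do not fall under another type",
}


def describe_label(label: str) -> str:
    """Return the table's description for a label, or a fallback string."""
    return LABEL_MEANINGS.get(label, "Not in the quick reference table")


# Usage with a processed doc (not executed here):
#   for ent in doc.ents:
#       print(ent.text, ent.label_, describe_label(ent.label_))
```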

Key Takeaways

  • Use spaCy's pre-trained models to quickly perform NER in Python.
  • Always process text with the nlp pipeline before accessing entities.
  • Install the required language model before loading it to avoid errors.
  • Use each entity's label (ent.label_) to tell what type of entity was detected.
  • NER models are not perfect; review results for your use case.