
Custom NER training basics in NLP

Introduction

Custom named entity recognition (NER) training teaches a model to find the words and phrases in text that matter to you. It learns to spot names, places, or any other category of thing you define.

Typical situations where it helps:
You want to find company names in emails automatically.
You need to spot product names in customer reviews.
You want to identify medical terms in health reports.
You want to extract dates and events from news articles.
You want to teach a chatbot to recognize custom terms.
Syntax
NLP
import spacy
from spacy.training.example import Example

# Load blank model
nlp = spacy.blank('en')

# Create NER component
ner = nlp.add_pipe('ner')

# Add labels
ner.add_label('CUSTOM_LABEL')

# Prepare training data
TRAIN_DATA = [
    ("Apple is a company", {"entities": [(0, 5, "CUSTOM_LABEL")]})
]

# Training loop
# initialize() replaces the deprecated begin_training() in spaCy v3
optimizer = nlp.initialize()
for i in range(10):
    for text, annotations in TRAIN_DATA:
        doc = nlp.make_doc(text)
        example = Example.from_dict(doc, annotations)
        nlp.update([example], sgd=optimizer)

# Test
doc = nlp("Apple is big")
for ent in doc.ents:
    print(ent.text, ent.label_)

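Use add_label to register each entity label (category) the model should learn. The words that belong to a label come from the training data, not from add_label.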

Each training example pairs a text with the start position, end position, and label of every entity in it.
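Counting character positions by hand is easy to get wrong. Here is a minimal sketch, assuming each entity phrase appears verbatim in its text, that derives the positions with str.find. The helper name make_example is hypothetical and not part of spaCy.

```python
def make_example(text, phrase, label):
    """Build one (text, annotations) pair in the format used by TRAIN_DATA."""
    start = text.find(phrase)  # index of the first character of the phrase
    if start == -1:
        raise ValueError(f"{phrase!r} not found in {text!r}")
    end = start + len(phrase)  # the end index is exclusive
    return (text, {"entities": [(start, end, label)]})


TRAIN_DATA = [
    make_example("Apple is a company", "Apple", "CUSTOM_LABEL"),
    make_example("I love Tesla cars", "Tesla", "ORG"),
]
print(TRAIN_DATA)
```

This sketch only labels the first occurrence of the phrase, so a text that mentions the same entity twice would need a loop over all matches.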

Examples
This adds a new label called 'PRODUCT' for the model to learn.
NLP
ner.add_label('PRODUCT')
Training example marking 'Tesla' as an organization at character positions 7 to 12 (the end index is exclusive).
NLP
TRAIN_DATA = [("I love Tesla cars", {"entities": [(7, 12, "ORG")]})]
Runs 5 training rounds on one prepared example. Here example and optimizer are assumed to come from the setup shown in the Syntax section.
NLP
for i in range(5):
    nlp.update([example], sgd=optimizer)
Sample Model

This program trains a simple model to recognize 'Apple' as a fruit. It shows how to add a label, prepare data, train, and test.

NLP
import spacy
from spacy.training.example import Example

# Create blank English model
nlp = spacy.blank('en')

# Add NER pipe
ner = nlp.add_pipe('ner')

# Add custom label
ner.add_label('FRUIT')

# Training data with 'Apple' as FRUIT
TRAIN_DATA = [
    ("I like Apple", {"entities": [(7, 12, "FRUIT")]})
]

# Start training (initialize() replaces the deprecated begin_training() in spaCy v3)
optimizer = nlp.initialize()

# Train for 10 iterations
for i in range(10):
    for text, annotations in TRAIN_DATA:
        doc = nlp.make_doc(text)
        example = Example.from_dict(doc, annotations)
        nlp.update([example], sgd=optimizer)

# Test the model
doc = nlp("Apple is tasty")
for ent in doc.ents:
    print(ent.text, ent.label_)
Important Notes

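Training a custom NER model needs many varied examples per label. A model trained on a single sentence mostly memorizes that sentence and may not recognize the entity in new text.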
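With more than one example, it is common to shuffle the data each round and to watch the loss the model reports. The following is a rough sketch under a few assumptions: spaCy v3 is installed, the three sentences are made-up illustration data, and ten rounds is an arbitrary choice rather than a recommendation.

```python
import random

import spacy
from spacy.training import Example

nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
ner.add_label("FRUIT")

# Made-up illustration data; a real data set would be much larger
TRAIN_DATA = [
    ("I like Apple", {"entities": [(7, 12, "FRUIT")]}),
    ("Apple is tasty", {"entities": [(0, 5, "FRUIT")]}),
    ("She ate an Apple today", {"entities": [(11, 16, "FRUIT")]}),
]

optimizer = nlp.initialize()
for epoch in range(10):
    random.shuffle(TRAIN_DATA)  # avoid showing examples in the same order each round
    losses = {}                 # nlp.update accumulates the loss per component here
    for text, annotations in TRAIN_DATA:
        example = Example.from_dict(nlp.make_doc(text), annotations)
        nlp.update([example], sgd=optimizer, losses=losses)
    print(epoch, losses)
```

A loss that keeps falling on such a tiny data set shows that the model is fitting these sentences, not that it will generalize to new ones.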

Positions in entities are character indexes into the text: the start index is inclusive and the end index is exclusive, so (7, 12) covers characters 7 through 11.
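A quick way to catch wrong positions before training is to slice the text with them and look at the result. The sketch below assumes the (text, {"entities": [...]}) format used on this page; check_annotations is a hypothetical helper, not a spaCy function. It also flags overlapping spans, since spaCy's NER does not accept overlapping entities.

```python
def check_annotations(train_data):
    """Return a list of problems found in (text, {"entities": [...]}) pairs."""
    problems = []
    for text, annotations in train_data:
        previous_end = 0
        for start, end, label in sorted(annotations["entities"]):
            if not (0 <= start < end <= len(text)):
                problems.append(f"{text!r}: ({start}, {end}) is out of range")
                continue
            if start < previous_end:
                problems.append(f"{text!r}: ({start}, {end}) overlaps another span")
            previous_end = max(previous_end, end)
            # Show the text each span actually covers
            print(f"{label}: {text[start:end]!r}")
    return problems


TRAIN_DATA = [
    ("I like Apple", {"entities": [(7, 12, "FRUIT")]}),
    ("I like Apple", {"entities": [(7, 13, "FRUIT")]}),  # deliberately wrong end index
]
problems = check_annotations(TRAIN_DATA)
print(problems)
```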

Start from a blank model (spacy.blank) so that no pretrained labels get mixed up with yours.
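To see what a blank model contains, you can inspect its pipeline and the labels of the NER component. This is a small sketch assuming spaCy v3; 'FRUIT' is just the example label from the sample program.

```python
import spacy

nlp = spacy.blank("en")
print(nlp.pipe_names)  # a blank pipeline starts with no components

ner = nlp.add_pipe("ner")
ner.add_label("FRUIT")

print(nlp.pipe_names)  # now lists the 'ner' component
print(ner.labels)      # only the labels you added yourself
```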

Summary

Custom NER training teaches a model to find your special words.

You prepare text with labeled parts and train the model in loops.

After training, the model can spot your custom words in new text.