Custom named entity recognition (NER) training teaches a model to find the words and phrases in text that matter to you. It learns to spot names, places, or any other kind of term you label for it.
Custom NER training basics in NLP
Introduction
You want to find company names in emails automatically.
You need to spot product names in customer reviews.
You want to identify medical terms in health reports.
You want to extract dates and events from news articles.
You want to teach a chatbot to recognize custom terms.
Syntax
NLP
import spacy
from spacy.training import Example

# Load a blank English model
nlp = spacy.blank('en')

# Create the NER component
ner = nlp.add_pipe('ner')

# Register the custom label
ner.add_label('CUSTOM_LABEL')

# Prepare training data: (text, {"entities": [(start, end, label)]})
TRAIN_DATA = [
    ("Apple is a company", {"entities": [(0, 5, "CUSTOM_LABEL")]})
]

# Initialize the model weights and get an optimizer
# (initialize() replaces the deprecated begin_training() in spaCy v3)
optimizer = nlp.initialize()

# Training loop
for i in range(10):
    for text, annotations in TRAIN_DATA:
        doc = nlp.make_doc(text)
        example = Example.from_dict(doc, annotations)
        nlp.update([example], sgd=optimizer)

# Test
doc = nlp("Apple is big")
for ent in doc.ents:
    print(ent.text, ent.label_)
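Use add_label to register each new entity label (category) you want the model to predict. It names the category, not the individual words.
Training data pairs each text with the start and end character positions of every entity, plus its label. The end position is exclusive, like a Python slice.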
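Counting character positions by hand is easy to get wrong, so it can help to compute them. The following is a minimal sketch, assuming each entity phrase appears verbatim in the text. `make_annotation` is a hypothetical helper written for this illustration and is not part of spaCy.

```python
def make_annotation(text, phrase, label):
    """Build one (text, {"entities": [...]}) tuple for the first occurrence of phrase."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in {text!r}")
    end = start + len(phrase)  # exclusive end, so text[start:end] == phrase
    return (text, {"entities": [(start, end, label)]})


# Reproduces the training tuple used in the syntax example
example = make_annotation("Apple is a company", "Apple", "CUSTOM_LABEL")
print(example)
```

Because only the first occurrence is matched, a phrase that appears more than once in a text would need a loop over all matches.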
Examples
This adds a new label called 'PRODUCT' for the model to learn.
NLP
ner.add_label('PRODUCT')
Training example showing 'Tesla' as an organization from position 7 to 12.
NLP
TRAIN_DATA = [("I love Tesla cars", {"entities": [(7, 12, "ORG")]})]
Runs training for 5 rounds to improve the model.
NLP
for i in range(5):
    nlp.update([example], sgd=optimizer)
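A slightly fuller version of this loop might shuffle the data each round and record the loss, so you can see whether training is making progress. This is a sketch under assumptions: it requires spaCy v3, the second training sentence is made up for illustration, and `loss_history` is a name introduced here.

```python
import random

import spacy
from spacy.training import Example

nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
ner.add_label("ORG")

# Illustrative data only; the second sentence is an assumption added for this sketch
TRAIN_DATA = [
    ("I love Tesla cars", {"entities": [(7, 12, "ORG")]}),
    ("Tesla makes electric cars", {"entities": [(0, 5, "ORG")]}),
]

optimizer = nlp.initialize()
random.seed(0)

loss_history = []
for epoch in range(5):
    random.shuffle(TRAIN_DATA)  # vary the example order every round
    losses = {}                 # nlp.update accumulates per-component losses here
    for text, annotations in TRAIN_DATA:
        example = Example.from_dict(nlp.make_doc(text), annotations)
        nlp.update([example], sgd=optimizer, losses=losses)
    loss_history.append(losses["ner"])
    print(epoch, losses["ner"])
```

A loss that stops falling is a rough hint that more rounds on the same data will not help much.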
Sample Model
This program trains a simple model to recognize 'Apple' as a fruit. It shows how to add a label, prepare data, train, and test.
NLP
import spacy
from spacy.training import Example

# Create a blank English model
nlp = spacy.blank('en')

# Add the NER pipe
ner = nlp.add_pipe('ner')

# Add the custom label
ner.add_label('FRUIT')

# Training data with 'Apple' as FRUIT (characters 7 to 12)
TRAIN_DATA = [
    ("I like Apple", {"entities": [(7, 12, "FRUIT")]})
]

# Start training: initialize the weights and get an optimizer
# (initialize() replaces the deprecated begin_training() in spaCy v3)
optimizer = nlp.initialize()

# Train for 10 iterations
for i in range(10):
    for text, annotations in TRAIN_DATA:
        doc = nlp.make_doc(text)
        example = Example.from_dict(doc, annotations)
        nlp.update([example], sgd=optimizer)

# Test the model
doc = nlp("Apple is tasty")
for ent in doc.ents:
    print(ent.text, ent.label_)
Important Notes
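The one-sentence datasets above only show the mechanics. A useful custom NER model needs many varied examples of each label, usually hundreds of sentences rather than a handful.
Positions in entities are start and end character indexes in the text, not word indexes, and the end index is exclusive.
Starting from a blank model (spacy.blank) keeps labels from a pretrained pipeline from mixing with your own.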
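Since wrong character indexes are a common source of trouble, it can be worth checking annotations before training. The sketch below uses plain Python and catches three typical mistakes: spans outside the text, spans that include surrounding whitespace, and spans that overlap (spaCy's doc.ents cannot hold overlapping entities). `check_entities` is a hypothetical helper, not a spaCy function.

```python
def check_entities(text, entities):
    """Return a list of problems found in (start, end, label) character spans."""
    problems = []
    previous_end = 0
    for start, end, label in sorted(entities):
        # The span must be non-empty and lie inside the text
        if not (0 <= start < end <= len(text)):
            problems.append(f"{label}: span ({start}, {end}) is outside the text")
            continue
        span_text = text[start:end]
        # Spans should hug the entity, without leading or trailing spaces
        if span_text != span_text.strip():
            problems.append(f"{label}: span {span_text!r} has surrounding whitespace")
        # Entities must not overlap one another
        if start < previous_end:
            problems.append(f"{label}: span ({start}, {end}) overlaps an earlier span")
        previous_end = max(previous_end, end)
    return problems


good = check_entities("I love Tesla cars", [(7, 12, "ORG")])
bad = check_entities("I love Tesla cars", [(6, 12, "ORG"), (10, 20, "ORG")])
print(good)
print(bad)
```

Running a check like this over the whole dataset before training makes annotation errors visible early, rather than as a model that quietly fails to learn.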
Summary
Custom NER training teaches a model to find your special words.
You prepare text with labeled parts and train the model in loops.
After training, the model can spot your custom words in new text.