Complete the code to import the library used for named entity recognition.
import [1]
The nltk library is commonly used for natural language processing tasks including named entity recognition.
Complete the code to tokenize the sentence for named entity recognition.
from nltk import word_tokenize

sentence = "Apple is looking at buying U.K. startup for $1 billion"
tokens = [1](sentence)
word_tokenize splits the sentence into words, which is the first step before tagging or chunking.
Fix the error in the code to perform named entity recognition on tokenized text.
from nltk import pos_tag, ne_chunk

tokens = ['Apple', 'is', 'looking', 'at', 'buying', 'U.K.', 'startup', 'for', '$', '1', 'billion']
pos_tags = [1](tokens)
named_entities = ne_chunk(pos_tags)
pos_tag is used to assign part-of-speech tags to tokens before named entity chunking.
Fill both blanks to extract named entities as a list of tuples (entity, type).
entities = []
for subtree in named_entities.[1]():
    if hasattr(subtree, '[2]'):
        entity_name = ' '.join([token for token, pos in subtree.leaves()])
        entity_type = subtree.label()
        entities.append((entity_name, entity_type))
We use subtrees() to iterate over the chunk tree and check for a label attribute to identify named entities. Note that subtrees() also yields the sentence-level root ('S'), which has a label as well, so practical code usually filters it out.
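A deterministic sketch of the extraction step, using a hand-built Tree that mimics ne_chunk's output for the sample sentence (the entity labels here are illustrative, not actual ne_chunk predictions). Every node yielded by subtrees() has a label, including the 'S' root, so this version skips the root explicitly:

```python
from nltk.tree import Tree

# Hypothetical chunked output: entity chunks are labelled subtrees,
# plain (token, POS) tuples are non-entity words
named_entities = Tree('S', [
    Tree('ORG', [('Apple', 'NNP')]),
    ('is', 'VBZ'), ('looking', 'VBG'), ('at', 'IN'), ('buying', 'VBG'),
    Tree('GPE', [('U.K.', 'NNP')]),
    ('startup', 'NN'), ('for', 'IN'), ('$', '$'), ('1', 'CD'), ('billion', 'CD'),
])

entities = []
for subtree in named_entities.subtrees():
    # subtrees() yields the 'S' root too, so filter on the label
    if subtree.label() != 'S':
        entity_name = ' '.join(token for token, pos in subtree.leaves())
        entities.append((entity_name, subtree.label()))

print(entities)  # [('Apple', 'ORG'), ('U.K.', 'GPE')]
```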
Fill all three blanks to train a simple NER model using spaCy.
import spacy
from spacy.training import Example

nlp = spacy.blank('en')
ner = nlp.add_pipe('[1]')

TRAIN_DATA = [
    ("Google is a tech company", {"entities": [(0, 6, '[2]')]}),
    ("I live in New York", {"entities": [(10, 18, '[3]')]})
]

optimizer = nlp.begin_training()
for itn in range(10):
    for text, annotations in TRAIN_DATA:
        doc = nlp.make_doc(text)
        example = Example.from_dict(doc, annotations)
        nlp.update([example], sgd=optimizer)
We add the 'ner' pipe to a blank English model and label the entities as ORG (organization) and GPE (geopolitical entity).
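A complete, runnable version of the training loop, assuming spaCy v3. It registers the ORG and GPE labels before initialization and uses nlp.initialize(), the v3 replacement for the deprecated begin_training(); the final lines show how to run the trained pipeline on new text:

```python
import spacy
from spacy.training import Example

nlp = spacy.blank('en')
ner = nlp.add_pipe('ner')

TRAIN_DATA = [
    ("Google is a tech company", {"entities": [(0, 6, 'ORG')]}),
    ("I live in New York", {"entities": [(10, 18, 'GPE')]}),
]

# Register entity labels so the model knows them before training
for _, annotations in TRAIN_DATA:
    for start, end, label in annotations["entities"]:
        ner.add_label(label)

optimizer = nlp.initialize()  # spaCy v3 API; replaces begin_training()
for itn in range(30):
    for text, annotations in TRAIN_DATA:
        doc = nlp.make_doc(text)
        example = Example.from_dict(doc, annotations)
        nlp.update([example], sgd=optimizer)

# Apply the trained pipeline to unseen text and inspect its predictions
doc = nlp("Google opened an office in New York")
print([(ent.text, ent.label_) for ent in doc.ents])
```

With only two training sentences the predictions are unreliable; real NER training needs substantially more annotated data.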