Named entity recognition in NLP - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to import the library used for named entity recognition.

import [1]
Options:
A. sklearn
B. pandas
C. matplotlib
D. nltk
Common Mistakes
Importing pandas or matplotlib, which are data-analysis and plotting libraries, not NLP libraries.
Using sklearn, which is a general-purpose machine learning library rather than an NER toolkit.
2. Fill in the blank (medium)

Complete the code to tokenize the sentence for named entity recognition.

from nltk import word_tokenize
sentence = "Apple is looking at buying U.K. startup for $1 billion"
tokens = [1](sentence)
Options:
A. word_tokenize
B. sent_tokenize
C. pos_tag
D. ne_chunk
Common Mistakes
Using sent_tokenize, which splits text into sentences, not words.
Calling pos_tag or ne_chunk on the raw string before it has been tokenized.
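
To make the word-level split concrete, here is a rough sketch of a tokenizer built on a single regular expression. It is not NLTK's algorithm: the name rough_word_tokenize and the pattern are assumptions chosen only so that the example sentence splits into the same tokens used in the next task.

```python
import re

# Hypothetical stand-in for word_tokenize; NLTK's real tokenizer uses
# different, more elaborate rules.
TOKEN_PATTERN = re.compile(
    r"(?:[A-Za-z]\.)+"   # dotted abbreviations such as U.K.
    r"|\w+"              # runs of letters, digits, or underscores
    r"|[^\w\s]"          # any single punctuation or symbol character
)

def rough_word_tokenize(text):
    """Return the list of non-overlapping pattern matches, left to right."""
    return TOKEN_PATTERN.findall(text)

sentence = "Apple is looking at buying U.K. startup for $1 billion"
tokens = rough_word_tokenize(sentence)
print(tokens)
```

Note that "$1" becomes two tokens, '$' and '1', while "U.K." stays in one piece. A sentence splitter would instead return the whole string as a single item.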
3. Fill in the blank (hard)

Fill in the blank so that the tokens are part-of-speech tagged before named entity chunking.

from nltk import pos_tag, ne_chunk

tokens = ['Apple', 'is', 'looking', 'at', 'buying', 'U.K.', 'startup', 'for', '$', '1', 'billion']
pos_tags = [1](tokens)
named_entities = ne_chunk(pos_tags)
Options:
A. word_tokenize
B. pos_tag
C. sent_tokenize
D. ne_chunk
Common Mistakes
Calling word_tokenize on a list that is already tokenized.
Calling ne_chunk before pos_tag; ne_chunk expects POS-tagged tokens as input.
4. Fill in the blank (hard)

Fill both blanks to extract named entities as a list of tuples (entity, type).

entities = []
for subtree in named_entities.[1]():
    # Skip the root 'S' node: it also has a label but is not an entity
    if hasattr(subtree, '[2]') and subtree.label() != 'S':
        entity_name = ' '.join([token for token, pos in subtree.leaves()])
        entity_type = subtree.label()
        entities.append((entity_name, entity_type))
Options:
A. subtrees
B. children
C. label
D. leaves
Common Mistakes
Using children(), which is not a method of NLTK's Tree and so cannot be called to walk the subtrees.
Checking for the leaves attribute instead of label.
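
The traversal can be tried without downloading any NLTK models by building the chunk tree by hand. The sketch below assumes a tree shaped like ne_chunk output; the entity labels ORGANIZATION and GPE are illustrative assumptions, not the result of actually running the chunker on this sentence.

```python
from nltk.tree import Tree

# Hand-built stand-in for ne_chunk output (labels are assumed, not predicted).
chunked = Tree("S", [
    Tree("ORGANIZATION", [("Apple", "NNP")]),
    ("is", "VBZ"),
    ("looking", "VBG"),
    ("at", "IN"),
    ("buying", "VBG"),
    Tree("GPE", [("U.K.", "NNP")]),
    ("startup", "NN"),
])

entities = []
# subtrees() yields the root first, so filter it out by its 'S' label.
for subtree in chunked.subtrees(filter=lambda t: t.label() != "S"):
    entity_name = " ".join(token for token, pos in subtree.leaves())
    entities.append((entity_name, subtree.label()))

print(entities)  # [('Apple', 'ORGANIZATION'), ('U.K.', 'GPE')]
```

Plain (token, tag) tuples are leaves rather than subtrees, so the traversal never yields them. The root is a subtree in its own right, which is why it needs the explicit filter.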
5. Fill in the blank (hard)

Fill all three blanks to train a simple NER model using spaCy.

import spacy
from spacy.training import Example

nlp = spacy.blank('en')
ner = nlp.add_pipe('[1]')

TRAIN_DATA = [
    ("Google is a tech company", {"entities": [(0, 6, '[2]')]}),
    ("I live in New York", {"entities": [(10, 18, '[3]')]})
]

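# spaCy v3: initialize() replaces the deprecated begin_training();
# passing the examples lets the pipeline infer the entity labels
optimizer = nlp.initialize(
    lambda: [Example.from_dict(nlp.make_doc(t), a) for t, a in TRAIN_DATA]
)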
for itn in range(10):
    for text, annotations in TRAIN_DATA:
        doc = nlp.make_doc(text)
        example = Example.from_dict(doc, annotations)
        nlp.update([example], sgd=optimizer)
Options:
A. ner
B. ORG
C. GPE
D. textcat
Common Mistakes
Adding the 'textcat' pipe, which is for text classification, not NER.
Using the wrong entity label, such as 'LOC' instead of 'GPE'.
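
The entity annotations in TRAIN_DATA are character offsets in the form (start, end, label), with end exclusive. A small helper can check offsets like these before training. This is a minimal sketch using only the standard library; find_span is a hypothetical name, not part of spaCy.

```python
def find_span(text, phrase):
    """Return (start, end) character offsets of the first occurrence of phrase.

    The end offset is exclusive, so text[start:end] == phrase.
    """
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in {text!r}")
    return start, start + len(phrase)

google_span = find_span("Google is a tech company", "Google")
new_york_span = find_span("I live in New York", "New York")

print(google_span)    # (0, 6)
print(new_york_span)  # (10, 18)
```

Slicing the text with the returned offsets gives back the entity string, which is a quick way to catch off-by-one errors in hand-written annotations.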