
Named entity recognition in NLP - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
What is the main goal of Named Entity Recognition (NER)?
Choose the best description of what Named Entity Recognition does in text processing.
A. Translate text from one language to another automatically.
B. Summarize long documents into short paragraphs.
C. Identify and classify key information like names, places, and dates in text.
D. Detect the sentiment or emotion expressed in text.
💡 Hint
Think about what entities like people or locations are in a sentence.
Predict Output
intermediate
Output of NER model prediction on a sentence
Given the following code using spaCy to perform NER, what is the printed output?
import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp('Apple is looking at buying U.K. startup for $1 billion')
for ent in doc.ents:
    print(ent.text, ent.label_)
A
Apple ORG
U.K. GPE
$1 billion MONEY
B
Apple ORG
U.K. LOC
$1 billion MONEY
C
Apple PERSON
U.K. LOC
$1 billion QUANTITY
D
Apple ORG
U.K. ORG
$1 billion MONEY
💡 Hint
ORG means organization, GPE means geopolitical entity, MONEY means monetary values.
Model Choice
advanced
Choosing the best model architecture for NER
Which model architecture is most suitable for Named Entity Recognition tasks?
A. K-Means clustering for unsupervised grouping
B. Convolutional Neural Network (CNN) for image classification
C. Generative Adversarial Network (GAN) for data generation
D. Recurrent Neural Network (RNN) or Transformer-based models with token-level classification
💡 Hint
NER requires understanding sequences of words and their context.
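Token-level classification means the model emits one tag per token, and entity spans are recovered from those tags afterwards. A minimal sketch of that decoding step, assuming the common BIO tagging scheme; the tags below are hand-written for illustration rather than model output, and `bio_to_spans` is a hypothetical helper name:

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse per-token BIO tags into (entity_text, entity_type) pairs."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        # An I- tag of the same type extends the entity that is already open.
        if tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
            continue
        # Any other tag closes the open entity, if there is one.
        if current_tokens:
            spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
        # B- starts a new entity; a stray I- is treated the same way.
        if tag.startswith(("B-", "I-")):
            current_tokens, current_type = [token], tag[2:]

    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Hand-written example tags (assumed, not produced by a model).
tokens = ["Microsoft", "was", "founded", "by", "Bill", "Gates", "."]
tags = ["B-ORG", "O", "O", "O", "B-PER", "I-PER", "O"]
entities = bio_to_spans(tokens, tags)
print(entities)
```

Because each tag depends on the surrounding words, architectures that read the whole sequence are a natural fit for producing the `tags` list.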
Metrics
advanced
Evaluating NER model performance
Which metric is most appropriate to evaluate the quality of a Named Entity Recognition model?
A. Precision, Recall, and F1-score on entity-level
B. BLEU score
C. Accuracy on sentence classification
D. Mean Squared Error (MSE)
💡 Hint
NER evaluation focuses on correctly identifying and classifying entities.
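Entity-level scoring can be sketched in a few lines. The example below assumes strict matching, where a predicted entity counts only if its boundaries and its label both match a gold entity exactly; the `Span` representation, the `entity_prf` helper, and the sample spans are illustrative assumptions:

```python
from typing import Dict, Set, Tuple

# (start_token, end_token_exclusive, label) -- an assumed span representation.
Span = Tuple[int, int, str]


def entity_prf(gold: Set[Span], predicted: Set[Span]) -> Dict[str, float]:
    """Strict entity-level precision, recall, and F1 over sets of spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Gold: "Microsoft" (ORG) and "Bill Gates" (PER).
gold = {(0, 1, "ORG"), (4, 6, "PER")}
# Prediction cuts the person span short, so it does not count as a match.
predicted = {(0, 1, "ORG"), (4, 5, "PER")}

scores = entity_prf(gold, predicted)
print(scores)
```

Under strict matching a boundary error is penalized twice: it is a false positive for the wrong span and a false negative for the missed gold span.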
🔧 Debug
expert
Why does this NER model fail to recognize entities correctly?
Consider this Python code snippet using a custom NER model. After training, it predicts no entities on test sentences. What is the most likely cause?
from transformers import AutoTokenizer, AutoModelForTokenClassification, Trainer, TrainingArguments
from datasets import load_dataset

model_name = 'bert-base-cased'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name, num_labels=9)

# Dataset loading and tokenization omitted for brevity

training_args = TrainingArguments(output_dir='./results', num_train_epochs=3)
trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset, eval_dataset=eval_dataset)
trainer.train()

# Prediction on test sentence
inputs = tokenizer('Microsoft was founded by Bill Gates.', return_tensors='pt')
outputs = model(**inputs)
predictions = outputs.logits.argmax(dim=-1)
print(predictions)
A. The tokenizer is incompatible with the model architecture causing prediction errors.
B. The model was not fine-tuned on labeled NER data, so it cannot predict entities correctly.
C. The number of labels is set incorrectly to 9 instead of 2.
D. The input sentence is too short for the model to detect entities.
💡 Hint
Pretrained models need fine-tuning on task-specific data to perform well.
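Fine-tuning on labeled NER data also requires the word-level labels to line up with the sub-word tokens the tokenizer produces, which is part of the step the snippet omits. A minimal sketch of one common alignment convention, assuming `word_ids` has the shape returned by a fast tokenizer's `word_ids()` method; the label ids, the sub-word split, and the `align_labels` helper are hypothetical:

```python
from typing import List, Optional

# Label value that PyTorch's cross-entropy loss ignores by default.
IGNORE_INDEX = -100


def align_labels(word_labels: List[int], word_ids: List[Optional[int]]) -> List[int]:
    """Give each word's label to its first sub-token and mask everything else."""
    aligned: List[int] = []
    previous_word: Optional[int] = None
    for word_id in word_ids:
        if word_id is None:
            aligned.append(IGNORE_INDEX)          # special tokens such as [CLS]/[SEP]
        elif word_id != previous_word:
            aligned.append(word_labels[word_id])  # first sub-token of a word
        else:
            aligned.append(IGNORE_INDEX)          # later sub-tokens of the same word
        previous_word = word_id
    return aligned


# Hypothetical label ids for six words, and a made-up tokenization in which
# the last word is split into two sub-tokens.
word_labels = [3, 0, 0, 0, 1, 2]
word_ids = [None, 0, 1, 2, 3, 4, 5, 5, None]

aligned_labels = align_labels(word_labels, word_ids)
print(aligned_labels)
```

If the training set carries no entity labels, or the labels are misaligned in this step, the classification head has nothing meaningful to learn from.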

Practice

1. What is the main goal of Named Entity Recognition (NER) in natural language processing?
easy
A. To find and label names of people, places, and dates in text
B. To translate text from one language to another
C. To summarize long documents into short paragraphs
D. To generate new text based on input

Solution

  1. Step 1: Understand NER purpose

    NER is designed to identify and label specific types of information like names, places, and dates in text.
  2. Step 2: Compare with other NLP tasks

    Translation, summarization, and text generation are different tasks unrelated to labeling entities.
  3. Final Answer:

    To find and label names of people, places, and dates in text -> Option A
  4. Quick Check:

    NER = Labeling names in text [OK]
Hint: NER finds names and dates in text, not translations or summaries [OK]
Common Mistakes:
  • Confusing NER with translation or summarization
  • Thinking NER generates new text
  • Believing NER only finds keywords, not entities
2. Which of the following is the correct way to import a Named Entity Recognition pipeline using Hugging Face Transformers in Python?
easy
A. import pipeline from transformers; ner = pipeline('named_entity')
B. from transformers import pipeline; ner = pipeline('ner')
C. from transformers import ner_pipeline; ner = ner_pipeline()
D. import ner from transformers; ner = pipeline('ner')

Solution

  1. Step 1: Recall correct import syntax

    The Hugging Face library uses 'from transformers import pipeline' to import the pipeline function.
  2. Step 2: Check pipeline usage for NER

    Calling pipeline('ner') creates a named entity recognition pipeline correctly.
  3. Final Answer:

    from transformers import pipeline; ner = pipeline('ner') -> Option B
  4. Quick Check:

    Correct import and pipeline call = from transformers import pipeline; ner = pipeline('ner') [OK]
Hint: Use 'from transformers import pipeline' and call pipeline('ner') [OK]
Common Mistakes:
  • Using incorrect import syntax
  • Calling pipeline with wrong task name
  • Trying to import non-existent functions
3. Given the following Python code using Hugging Face Transformers NER pipeline:
from transformers import pipeline
ner = pipeline('ner')
text = "Barack Obama was born in Hawaii on August 4, 1961."
results = ner(text)
print(results)

What will be the output type of results?
medium
A. A single string with all entities concatenated
B. A dictionary with entity counts
C. A list of dictionaries with entity details
D. An integer representing number of entities

Solution

  1. Step 1: Understand pipeline output format

    The NER pipeline returns a list where each item is a dictionary describing an entity found in the text.
  2. Step 2: Check example output structure

    Each dictionary contains keys such as 'entity', 'score', 'index', 'word', 'start', and 'end' describing one tagged token.
  3. Final Answer:

    A list of dictionaries with entity details -> Option C
  4. Quick Check:

    NER output = list of entity dictionaries [OK]
Hint: NER pipeline returns list of dicts, not strings or counts [OK]
Common Mistakes:
  • Expecting a single string output
  • Thinking output is a dictionary summary
  • Assuming output is just a count number
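To make the output shape concrete, here is a small sketch that walks over a list shaped like the pipeline's ungrouped result. The entries in `sample_results` are hand-written for illustration (the scores, indices, and labels are made up, not real model output), so only the structure should be relied on:

```python
text = "Barack Obama was born in Hawaii on August 4, 1961."

# Hand-written stand-in for `ner(text)`; values are illustrative assumptions.
sample_results = [
    {"entity": "I-PER", "score": 0.99, "index": 1, "word": "Barack", "start": 0, "end": 6},
    {"entity": "I-PER", "score": 0.99, "index": 2, "word": "Obama", "start": 7, "end": 12},
    {"entity": "I-LOC", "score": 0.99, "index": 6, "word": "Hawaii", "start": 25, "end": 31},
]

# The result is a list, and each item is a dict describing one tagged token.
for item in sample_results:
    span_text = text[item["start"]:item["end"]]
    print(f"{span_text:<8} {item['entity']:<6} {item['score']:.2f}")

words = [item["word"] for item in sample_results]
```

The `start` and `end` character offsets index back into the original string, which is usually more reliable than matching on `word`.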
4. Consider this code snippet for NER using Hugging Face Transformers:
from transformers import pipeline
ner = pipeline('ner')
text = "Apple is looking at buying U.K. startup for $1 billion"
results = ner(text, grouped_entities=True)
print(results)

What is the likely error or issue here?
medium
A. The text input must be a list, not a string
B. The pipeline call is missing the model parameter
C. There is no error; code runs correctly
D. The argument 'grouped_entities' is invalid and causes a TypeError

Solution

  1. Step 1: Check pipeline argument validity

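    'grouped_entities' is a legacy argument name. Transformers 4.x releases accept it with a deprecation warning that points to 'aggregation_strategy'; the TypeError this question expects applies only to releases that no longer accept the argument.
  2. Step 2: Confirm correct usage

    To group entities, pass 'aggregation_strategy' with a value like 'simple' instead of 'grouped_entities'.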
  3. Final Answer:

    The argument 'grouped_entities' is invalid and causes a TypeError -> Option D
  4. Quick Check:

    Invalid argument causes error = The argument 'grouped_entities' is invalid and causes a TypeError [OK]
Hint: Check pipeline argument names carefully; 'grouped_entities' is wrong [OK]
Common Mistakes:
  • Using unsupported argument names
  • Assuming text input must be a list
  • Thinking missing model parameter causes error
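Whatever the argument is called, grouping means merging adjacent token-level results that share an entity type into a single entry. The sketch below is a rough approximation of that idea and not the library's implementation: it ignores the B-/I- distinction, and `group_entities` and the sample token list are hypothetical. The 'entity_group' key mirrors the key the pipeline uses for grouped output.

```python
from typing import Dict, List


def group_entities(text: str, token_results: List[Dict]) -> List[Dict]:
    """Merge touching tokens of the same entity type into one entry."""
    groups: List[Dict] = []
    for tok in token_results:
        label = tok["entity"].split("-")[-1]  # "I-ORG" -> "ORG"
        last = groups[-1] if groups else None
        # Tokens are adjacent if at most one character (a space) separates them.
        adjacent = last is not None and tok["start"] <= last["end"] + 1
        if adjacent and last["entity_group"] == label:
            last["end"] = tok["end"]
            last["scores"].append(tok["score"])
        else:
            groups.append({"entity_group": label, "start": tok["start"],
                           "end": tok["end"], "scores": [tok["score"]]})
    return [
        {"entity_group": g["entity_group"],
         "word": text[g["start"]:g["end"]],
         "score": sum(g["scores"]) / len(g["scores"]),
         "start": g["start"], "end": g["end"]}
        for g in groups
    ]


text = "Apple is looking at buying U.K. startup for $1 billion"
# Hand-written token-level results; scores and labels are illustrative only.
token_results = [
    {"entity": "I-ORG", "score": 0.99, "word": "Apple", "start": 0, "end": 5},
    {"entity": "I-LOC", "score": 0.98, "word": "U", "start": 27, "end": 28},
    {"entity": "I-LOC", "score": 0.96, "word": ".", "start": 28, "end": 29},
    {"entity": "I-LOC", "score": 0.98, "word": "K", "start": 29, "end": 30},
    {"entity": "I-LOC", "score": 0.96, "word": ".", "start": 30, "end": 31},
]

grouped = group_entities(text, token_results)
print([(g["entity_group"], g["word"]) for g in grouped])
```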
5. You want to extract all person names and locations from a news article using NER. Which approach best ensures you only get these two entity types from the pipeline output?
hard
A. Filter the NER results by checking if the entity label is 'PER' or 'LOC'
B. Use the pipeline with task='ner' and no filtering
C. Manually search the text for capitalized words
D. Train a new model only on person and location data

Solution

  1. Step 1: Understand NER output labels

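    NER results include entity labels like 'PER' for person and 'LOC' for location. With CoNLL-style models the ungrouped output prefixes these tags (for example 'B-PER' or 'I-LOC'), while grouped output reports the bare type under 'entity_group'.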
  2. Step 2: Filter results for desired entities

    Filtering the output by these labels extracts only person and location entities effectively.
  3. Final Answer:

    Filter the NER results by checking if the entity label is 'PER' or 'LOC' -> Option A
  4. Quick Check:

    Filter by labels 'PER' and 'LOC' to get persons and locations [OK]
Hint: Filter NER output by entity labels to get specific types [OK]
Common Mistakes:
  • Not filtering and getting all entity types
  • Trying manual text search instead of using labels
  • Assuming retraining is needed for filtering
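The filtering step can be sketched as a plain list comprehension. The example assumes CoNLL-style labels and handles both grouped output (key 'entity_group') and ungrouped output (key 'entity' with a B-/I- prefix); `filter_entities` and the sample results are hypothetical, hand-written for illustration:

```python
from typing import Dict, Iterable, List

WANTED_TYPES = {"PER", "LOC"}


def filter_entities(results: List[Dict], wanted: Iterable[str] = WANTED_TYPES) -> List[Dict]:
    """Keep only results whose entity type is in `wanted`."""
    wanted = set(wanted)
    kept = []
    for item in results:
        # Grouped output uses 'entity_group'; ungrouped output uses 'entity'.
        label = item.get("entity_group", item.get("entity", ""))
        entity_type = label.split("-")[-1]  # strips a "B-" or "I-" prefix if present
        if entity_type in wanted:
            kept.append(item)
    return kept


# Hand-written stand-in for pipeline output; values are illustrative only.
sample_results = [
    {"entity_group": "PER", "word": "Barack Obama", "score": 0.99},
    {"entity_group": "LOC", "word": "Hawaii", "score": 0.99},
    {"entity_group": "ORG", "word": "United Nations", "score": 0.98},
    {"entity": "I-PER", "word": "Michelle", "score": 0.99},
]

people_and_places = filter_entities(sample_results)
print([item["word"] for item in people_and_places])
```

Label names depend on the model, so it is worth checking the model's label set before hard-coding 'PER' and 'LOC'.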