
Why NER extracts structured information in NLP - Challenge Your Understanding

Challenge - 5 Problems
🧠 Conceptual
intermediate
Why does Named Entity Recognition (NER) extract structured information?

NER is used to find specific pieces of information in text, like names or dates. Why is this considered extracting structured information?

A. Because NER converts unorganized text into labeled categories such as person, location, or date, making the data easier to analyze.
B. Because NER translates text into another language to structure it.
C. Because NER removes all punctuation to create a clean text format.
D. Because NER summarizes the entire text into a short paragraph.
💡 Hint

Think about how NER tags parts of text with labels that computers can understand easily.
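To make "structured information" concrete, here is a minimal sketch of turning free text into labeled records. It assumes a tiny hand-written lookup table (the hypothetical `GAZETTEER`) as a stand-in for a trained NER model; real systems predict labels from context rather than from a fixed list, and the function name `tag_entities` is only illustrative.

```python
import re

# Hypothetical gazetteer: a stand-in for a trained NER model, for illustration only.
GAZETTEER = {
    "Ada Lovelace": "PERSON",
    "London": "GPE",
    "1843": "DATE",
}


def tag_entities(text):
    """Return one record per gazetteer match, ordered by position in the text."""
    records = []
    for phrase, label in GAZETTEER.items():
        for match in re.finditer(re.escape(phrase), text):
            records.append(
                {
                    "text": phrase,
                    "label": label,
                    "start": match.start(),
                    "end": match.end(),
                }
            )
    return sorted(records, key=lambda record: record["start"])


sentence = "Ada Lovelace published her notes in London in 1843."
records = tag_entities(sentence)
for record in records:
    print(record)
```

Each record carries a label and character offsets, so the result can be filtered, counted, or loaded into a table, which is not possible with the raw sentence alone.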

Predict Output
intermediate
Output of NER entity extraction code

What is the output of this Python code using spaCy to extract entities?

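import spacy

# Load spaCy's small English pipeline (the en_core_web_sm model must be installed)
nlp = spacy.load('en_core_web_sm')
doc = nlp('Apple was founded by Steve Jobs in California.')
# Collect each detected entity as a (text, label) pair
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)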
A. [('Apple', 'PERSON'), ('Steve Jobs', 'ORG'), ('California', 'LOC')]
B. [('Apple', 'LOC'), ('Steve Jobs', 'GPE'), ('California', 'PERSON')]
C. [('Apple', 'GPE'), ('Steve Jobs', 'PERSON'), ('California', 'ORG')]
D. [('Apple', 'ORG'), ('Steve Jobs', 'PERSON'), ('California', 'GPE')]
💡 Hint

Remember that 'Apple' is a company (organization), 'Steve Jobs' is a person, and 'California' is a geopolitical entity.

Model Choice
advanced
Choosing the best model for NER on noisy social media text

You want to extract structured information from tweets that contain slang, misspellings, and emojis. Which model is best suited for this NER task?

A. A rule-based NER system using fixed dictionaries
B. A simple logistic regression model trained on formal news articles
C. A pre-trained BERT model fine-tuned on social media NER datasets
D. A clustering algorithm that groups similar words without labels
💡 Hint

Consider which model can understand context and adapt to informal language.

Metrics
advanced
Evaluating NER model performance with F1 score

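An NER model predicted 80 entities correctly (true positives), missed 20 entities (false negatives), and predicted 10 entities that were not actually entities (false positives). What is the F1 score, rounded to two decimal places?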

A. 0.84
B. 0.80
C. 0.75
D. 0.88
💡 Hint

Calculate precision and recall first, then use F1 = 2 * (precision * recall) / (precision + recall).
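The calculation in the hint can be sketched as a small helper. The function name `precision_recall_f1` and the counts passed to it are hypothetical and deliberately differ from the question, so the challenge stays open; the mapping assumed here is correct predictions = true positives, missed entities = false negatives, and incorrect predictions = false positives.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute entity-level precision, recall, and F1 from raw counts."""
    predicted = true_positives + false_positives  # everything the model output
    actual = true_positives + false_negatives     # everything it should have found
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * (precision * recall) / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, not the ones from the question above.
precision, recall, f1 = precision_recall_f1(
    true_positives=6, false_positives=2, false_negatives=4
)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# prints: precision=0.75 recall=0.60 f1=0.67
```

Because F1 is the harmonic mean, it sits closer to the lower of the two values than a simple average would.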

🔧 Debug
expert
Why does this NER model fail to extract entities from text in a new domain?

You trained an NER model on news articles, but it performs poorly on medical reports. What is the most likely reason?

A. The model architecture is incorrect and cannot process text longer than 100 words.
B. The model was trained on a different domain and cannot generalize well to medical terms.
C. The training data had too many entities, causing overfitting.
D. The model uses a wrong loss function that ignores entity labels.
💡 Hint

Think about how domain differences affect model understanding.
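One rough way to see a domain difference is to measure how many tokens in the new text never appeared in the training text. The sketch below assumes whitespace tokenization and two tiny made-up corpora; the function name `oov_rate` and the example sentences are hypothetical, and a real check would use the model's own tokenizer and far more data.

```python
def oov_rate(train_texts, target_texts):
    """Fraction of target tokens that never occur in the training texts."""
    vocabulary = {
        token.lower() for text in train_texts for token in text.split()
    }
    target_tokens = [
        token.lower() for text in target_texts for token in text.split()
    ]
    if not target_tokens:
        return 0.0
    unseen = sum(1 for token in target_tokens if token not in vocabulary)
    return unseen / len(target_tokens)


# Hypothetical miniature corpora, for illustration only.
news = [
    "The senator visited the city on Monday",
    "The company opened a new office",
]
medical = ["The patient received metformin on Monday"]

rate = oov_rate(news, medical)
print(f"Out-of-vocabulary rate: {rate:.0%}")
# prints: Out-of-vocabulary rate: 50%
```

A high rate does not prove that the model will fail, but it suggests that many of the terms it must label were never seen during training.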