
Information extraction patterns in NLP - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
What is the main purpose of named entity recognition (NER) in information extraction?

Named entity recognition (NER) is a common pattern in information extraction. What does NER primarily do?

A. Summarize long documents into short paragraphs
B. Translate text from one language to another
C. Identify and classify key information like names, locations, and dates in text
D. Detect the sentiment or emotion expressed in text
💡 Hint

Think about extracting specific types of information such as people or places.
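As a rough illustration of what "extracting specific types of information" can look like, here is a minimal rule-based sketch. It is not how trained NER models work: `GAZETTEER`, `DATE_PATTERN`, and `extract_entities` are hypothetical names made up for this example, and the lookup table stands in for what a real system would learn from data.

```python
import re

# Hypothetical lookup table for illustration only; real NER systems
# learn entity labels from annotated data instead of a fixed list.
GAZETTEER = {"Alice": "PERSON", "Paris": "LOCATION"}
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_entities(text):
    """Return (span_text, label) pairs in order of appearance."""
    found = []
    # Look up each word in the gazetteer.
    for match in re.finditer(r"\b\w+\b", text):
        label = GAZETTEER.get(match.group())
        if label:
            found.append((match.start(), match.group(), label))
    # Add ISO-style dates found by the regex.
    for match in DATE_PATTERN.finditer(text):
        found.append((match.start(), match.group(), "DATE"))
    found.sort()  # order by character offset
    return [(span, label) for _, span, label in found]


print(extract_entities("Alice flew to Paris on 2024-07-15."))
# [('Alice', 'PERSON'), ('Paris', 'LOCATION'), ('2024-07-15', 'DATE')]
```

Each result pairs a text span with a category, which is the general shape of output an entity extractor produces.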

Predict Output
intermediate
What is the output of this simple pattern matching code?

Given the code below that extracts dates from text using a regex pattern, what is the output?

import re
text = 'The event is on 2024-07-15 and registration ends 2024-06-30.'
dates = re.findall(r'\d{4}-\d{2}-\d{2}', text)
print(dates)
A. ['2024', '07', '15', '2024', '06', '30']
B. ['2024-07-15', '2024-06-30']
C. ['15-07-2024', '30-06-2024']
D. []
💡 Hint

Look at the regex pattern and what it matches.
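One detail worth knowing when reading `re.findall` calls: the return shape depends on whether the pattern contains capturing groups. The sketch below uses a different, made-up sentence to show both forms side by side.

```python
import re

text = "Invoice dated 2023-01-05, due 2023-02-04."

# No capturing groups: findall returns each full match as a string.
whole = re.findall(r"\d{4}-\d{2}-\d{2}", text)

# Capturing groups: findall returns one tuple of groups per match.
parts = re.findall(r"(\d{4})-(\d{2})-(\d{2})", text)

print(whole)  # ['2023-01-05', '2023-02-04']
print(parts)  # [('2023', '01', '05'), ('2023', '02', '04')]
```

Neither form reorders or reformats the matched text; the pattern only decides which substrings are returned.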

Model Choice
advanced
Which model type is best suited for extracting relationships between entities in text?

You want to extract not just entities but also the relationships between them (like 'works_for' or 'located_in'). Which model type is best for this task?

A. Relation extraction models using transformers with classification heads
B. Topic modeling using Latent Dirichlet Allocation (LDA)
C. Sequence labeling models like BiLSTM-CRF
D. Text generation models like GPT
💡 Hint

Think about models that classify pairs of entities for relationships.
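To make "classifying pairs of entities" concrete, the sketch below shows only the input-preparation step under some assumptions: entity spans are already known and do not overlap, and each ordered pair is wrapped in marker tokens before being handed to a classifier. The marker strings (`[E1]`, `[E2]`) and the helper `mark_pair` are illustrative choices, and the classifier itself is not shown.

```python
from itertools import permutations


def mark_pair(tokens, head, tail):
    """Wrap the head and tail spans (start, end-exclusive) in marker tokens."""
    out = []
    for i, tok in enumerate(tokens):
        if i == head[0]:
            out.append("[E1]")
        if i == tail[0]:
            out.append("[E2]")
        out.append(tok)
        if i == head[1] - 1:
            out.append("[/E1]")
        if i == tail[1] - 1:
            out.append("[/E2]")
    return " ".join(out)


tokens = ["Alice", "works", "for", "Acme", "in", "Berlin"]
entity_spans = [(0, 1), (3, 4), (5, 6)]  # assumed output of an earlier NER step

# One candidate per ordered pair; a classifier would label each one
# with a relation such as 'works_for', 'located_in', or 'no_relation'.
candidates = [mark_pair(tokens, h, t) for h, t in permutations(entity_spans, 2)]
for sentence in candidates:
    print(sentence)
```

Ordered pairs are used because relations such as 'works_for' are directional: (Alice, Acme) and (Acme, Alice) are different candidates.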

Metrics
advanced
Which metric is most appropriate to evaluate an information extraction model that identifies entities?

You have a model that extracts entities from text. Which metric best measures how well it finds the correct entities?

A. BLEU score
B. Mean Squared Error
C. Perplexity
D. F1 score
💡 Hint

Consider metrics that balance precision and recall.
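For reference, precision and recall over extracted entities can be computed by comparing the predicted set against a gold set. The sketch below assumes exact-match scoring on (text, label) pairs; the function name `entity_scores` and the sample data are made up for illustration.

```python
def entity_scores(gold, predicted):
    """Return (precision, recall, f1) using exact match on (text, label) pairs."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold = {("Alice", "PERSON"), ("Paris", "LOCATION"), ("2024-07-15", "DATE")}
predicted = {("Alice", "PERSON"), ("Paris", "PERSON")}  # one correct, one mislabeled

p, r, f1 = entity_scores(gold, predicted)
print(f"{p:.2f} {r:.2f} {f1:.2f}")  # 0.50 0.33 0.40
```

Here the mislabeled entity lowers precision, and the two entities the model did not recover lower recall; the harmonic mean combines both into one number.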

🔧 Debug
expert
Does this entity extraction code fail to find any entities?

Consider the code below, which tries to extract person names using a simple pattern. It is claimed to find no matches. Which statement about it is correct?

import re
text = 'Alice and Bob went to the market.'
pattern = r'[A-Z][a-z]+'
matches = re.findall(pattern, text)
print(matches)
A. The pattern misses names because it only matches two-letter words starting with uppercase
B. The pattern is correct and should find all names
C. The code has a syntax error in the regex pattern
D. The text variable is empty
💡 Hint

Look carefully at what the regex pattern matches.
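A quick way to check what a pattern such as `[A-Z][a-z]+` matches is to run it on a few other sentences. The sample strings below are made up; they show that the pattern is a capitalization heuristic, not a name detector.

```python
import re

pattern = r"[A-Z][a-z]+"  # one uppercase letter followed by one or more lowercase letters

first = re.findall(pattern, "Yesterday Carol met Dan in Paris.")
second = re.findall(pattern, "DJ and O'Neil left.")

print(first)   # ['Yesterday', 'Carol', 'Dan', 'Paris']
print(second)  # ['Neil']
```

The first result includes a sentence-initial word and a city alongside the names, and the second misses 'DJ' and splits "O'Neil", which is why capitalization patterns are usually treated as a rough baseline.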