Information extraction patterns in NLP - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to extract named entities from text using spaCy.

import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp('Apple is looking at buying U.K. startup for $1 billion')
for ent in doc.[1]:
    print(ent.text, ent.label_)
Options:
A. entities
B. ents
C. tokens
D. spans
Common Mistakes
Using 'entities' instead of 'ents' causes an AttributeError.
Iterating over 'tokens', which is not a Doc attribute, instead of the entity spans.

2. Fill in the blank (medium)

Complete the code to extract all email addresses from a text using a regular expression.

import re
text = 'Contact us at support@example.com or sales@example.org'
emails = re.findall(r'[1]', text)
print(emails)
Options:
A. \b\w+\b
B. \d{3}-\d{2}-\d{4}
C. [\w\.-]+@[\w\.-]+\.[a-zA-Z]{2,6}
D. https?://[\w\.-]+
Common Mistakes
Choosing a pattern that matches digit groups or URLs instead of email addresses.
Leaving the dot before the top-level domain unescaped, so it matches any character instead of a literal '.'.
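
For reference, here is a minimal sketch of how an email pattern like option C behaves with re.findall. The helper name extract_emails is hypothetical, and the pattern is a deliberate simplification: it does not cover every address allowed by the email standards.

```python
import re

# Simplified email pattern: local part, "@", domain, then a literal dot
# followed by a 2-6 letter top-level domain.
EMAIL_PATTERN = re.compile(r'[\w\.-]+@[\w\.-]+\.[a-zA-Z]{2,6}')


def extract_emails(text):
    """Return every non-overlapping match of EMAIL_PATTERN, in order."""
    return EMAIL_PATTERN.findall(text)


text = 'Contact us at support@example.com or sales@example.org'
emails = extract_emails(text)
print(emails)  # ['support@example.com', 'sales@example.org']
```

Because the pattern has no capturing groups, re.findall returns the full matched strings rather than tuples of groups.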

3. Fill in the blank (hard)

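Complete the code to extract dates from text using spaCy's Matcher. Because the pattern describes a single token, each token tagged with the chosen entity type is reported as its own match.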

from spacy.matcher import Matcher
import spacy
nlp = spacy.load('en_core_web_sm')
matcher = Matcher(nlp.vocab)
pattern = [{'ENT_TYPE': '[1]'}]
matcher.add('DATE_PATTERN', [pattern])
doc = nlp('We met on January 10th, 2023.')
matches = matcher(doc)
for match_id, start, end in matches:
    span = doc[start:end]
    print(span.text)
Options:
A. DATE
B. TIME
C. PERSON
D. ORG
Common Mistakes
Using 'TIME' or other entity types that do not match dates.
Not using the correct key 'ENT_TYPE' in the pattern.

4. Fill in the blank (hard)

Fill both blanks to create a dictionary of word frequencies for the words that appear more than once in a text.

text = 'apple banana apple orange banana apple'
words = text.split()
freq = {word: [1] for word in words if words.count(word) [2] 1}
print(freq)
Options:
A. words.count(word)
B. words.index(word)
C. >
D. ==
Common Mistakes
Using index instead of count: index returns the position of the first occurrence, not the frequency.
Using '==' instead of '>', which keeps only the words that appear exactly once.
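
The comprehension in this task calls words.count(word) once per word, so the list is rescanned each time. As one possible alternative (not part of the exercise), the sketch below builds the same filtered dictionary with collections.Counter, which tallies all words in a single pass.

```python
from collections import Counter

text = 'apple banana apple orange banana apple'
words = text.split()

# Counter tallies every word in one pass over the list.
counts = Counter(words)

# Keep only the words that occur more than once, mirroring the `> 1` filter.
freq = {word: n for word, n in counts.items() if n > 1}
print(freq)  # {'apple': 3, 'banana': 2}
```

'orange' is dropped because it occurs only once.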

5. Fill in the blank (hard)

Fill all three blanks to extract entities and their labels into a dictionary using spaCy.

import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp('Google was founded in September 1998 by Larry Page and Sergey Brin.')
entities = [1]
for ent in doc.ents:
    entities[ent.[2]] = ent.[3]
print(entities)
Options:
A. {}
B. text
C. label_
D. list
Common Mistakes
Using list instead of {}: list is the type itself, not an empty dictionary, so item assignment fails.
Using ent.label (an integer ID) instead of ent.label_ (the string label).
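
A dictionary keyed by entity text maps each entity to one label. If the goal is instead to list all entities of a given type, grouping by label is a common variation. The sketch below illustrates both shapes with plain Python. The sample_entities pairs are hypothetical stand-ins for (ent.text, ent.label_), chosen for illustration; a real spaCy model may segment or label the sentence differently.

```python
from collections import defaultdict

# Hypothetical (text, label) pairs standing in for (ent.text, ent.label_);
# these are illustrative values, not actual model output.
sample_entities = [
    ('Google', 'ORG'),
    ('September 1998', 'DATE'),
    ('Larry Page', 'PERSON'),
    ('Sergey Brin', 'PERSON'),
]

# Same shape as the exercise: entity text -> label.
text_to_label = {}
for text, label in sample_entities:
    text_to_label[text] = label

# Inverse view: label -> list of entity texts, so entities sharing a
# label are all kept instead of overwriting one another.
label_to_texts = defaultdict(list)
for text, label in sample_entities:
    label_to_texts[label].append(text)

print(text_to_label)
print(dict(label_to_texts))
```

With real spaCy output, the same two loops would iterate over doc.ents and read ent.text and ent.label_ in place of the tuple values.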