Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)
Complete the code to extract named entities from text using spaCy.
NLP
import spacy

nlp = spacy.load('en_core_web_sm')
doc = nlp('Apple is looking at buying U.K. startup for $1 billion')
for ent in doc.[1]:
    print(ent.text, ent.label_)
Common Mistakes
Using 'entities' instead of 'ents' causes an AttributeError.
Trying to iterate over 'tokens' instead of entities.
Answer: ents. The named entities of a spaCy Doc object are exposed through its 'ents' attribute.
2. Fill in the blank (medium)
Complete the code to extract all email addresses from a text using a regular expression.
NLP
import re

text = 'Contact us at support@example.com or sales@example.org'
emails = re.findall(r'[1]', text)
print(emails)
Common Mistakes
Using regex for phone numbers or URLs instead of emails.
Missing escape characters in regex.
Answer: [\w\.-]+@[\w\.-]+\.[a-zA-Z]{2,6}. This regex matches typical email addresses: a local part, '@', a domain, and a final label of 2 to 6 letters.
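The completed exercise can be sketched as a small reusable function. This is a minimal illustration, not a full email validator: the names EMAIL_PATTERN and extract_emails are made up for this example, and the pattern is the one from the answer above, so it inherits the same 2 to 6 letter limit on the final label.

```python
import re

# Pattern from the answer above. The {2,6} bound on the final label is the
# exercise's choice and would miss longer top-level domains.
EMAIL_PATTERN = re.compile(r'[\w\.-]+@[\w\.-]+\.[a-zA-Z]{2,6}')


def extract_emails(text):
    """Return every substring of text that matches EMAIL_PATTERN, in order."""
    return EMAIL_PATTERN.findall(text)


text = 'Contact us at support@example.com or sales@example.org'
emails = extract_emails(text)
print(emails)  # ['support@example.com', 'sales@example.org']
```

Compiling the pattern once is only a convenience when the same pattern is applied to many texts; calling re.findall directly, as the exercise does, gives the same result.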
3. Fill in the blank (hard)
Fix the error in the code to extract dates from text using spaCy's Matcher.
NLP
import spacy
from spacy.matcher import Matcher

nlp = spacy.load('en_core_web_sm')
matcher = Matcher(nlp.vocab)
pattern = [{'ENT_TYPE': '[1]'}]
matcher.add('DATE_PATTERN', [pattern])
doc = nlp('We met on January 10th, 2023.')
matches = matcher(doc)
for match_id, start, end in matches:
    span = doc[start:end]
    print(span.text)
Common Mistakes
Using 'TIME' or other entity types that do not match dates.
Not using the correct key 'ENT_TYPE' in the pattern.
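Answer: DATE. spaCy labels date entities 'DATE', so the pattern matches tokens whose entity type is 'DATE'. Because the pattern describes a single token, each token inside a date entity is reported as its own match.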
4. Fill in the blank (hard)
Fill both blanks to create a dictionary of word frequencies from a text.
NLP
text = 'apple banana apple orange banana apple'
words = text.split()
freq = {word: [1] for word in words if words.count(word) [2] 1}
print(freq)
Common Mistakes
Using index instead of count for frequency.
Using '==' instead of '>' to filter words.
Answer: [1] is words.count(word) and [2] is >. Each word is counted, and only words that occur more than once are kept, so the code prints {'apple': 3, 'banana': 2}.
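The same filtered frequency dictionary can also be built with collections.Counter from the standard library. This is an alternative sketch, not the form the exercise asks for: calling words.count(word) inside the comprehension rescans the whole list for every word, while Counter tallies all words in a single pass.

```python
from collections import Counter

text = 'apple banana apple orange banana apple'
words = text.split()

# Counter tallies every word in one pass over the list.
counts = Counter(words)

# Keep only the words that occur more than once, as in the exercise.
freq = {word: n for word, n in counts.items() if n > 1}
print(freq)  # {'apple': 3, 'banana': 2}
```

For a text this short the difference is negligible; the Counter form mainly matters for longer inputs.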
5. Fill in the blank (hard)
Fill all three blanks to extract entities and their labels into a dictionary using spaCy.
NLP
import spacy

nlp = spacy.load('en_core_web_sm')
doc = nlp('Google was founded in September 1998 by Larry Page and Sergey Brin.')
entities = [1]
for ent in doc.ents:
    entities[ent.[2]] = ent.[3]
print(entities)
Common Mistakes
Using list instead of dictionary for entities.
Using ent.label (the integer ID) instead of ent.label_ (the string name).
Answer: [1] is {}, [2] is text, [3] is label_. The code creates an empty dictionary, then uses ent.text as the key and ent.label_ as the value.
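One property of a text-keyed dictionary is worth seeing in isolation: repeated mentions of the same entity text collapse into a single key. The sketch below uses plain Python only. The pairs list is a hypothetical stand-in for the (ent.text, ent.label_) values that a real run would read from doc.ents, and the labels shown are assumed for illustration, since actual labels depend on the loaded model. Grouping by label with a defaultdict is one possible alternative, not something the exercise requires.

```python
from collections import defaultdict

# Hypothetical stand-ins for (ent.text, ent.label_) pairs; a real run would
# take them from doc.ents, and the labels depend on the loaded model.
pairs = [
    ('Google', 'ORG'),
    ('September 1998', 'DATE'),
    ('Larry Page', 'PERSON'),
    ('Sergey Brin', 'PERSON'),
    ('Google', 'ORG'),  # repeated mention
]

# Text-keyed dictionary, as in the exercise: repeated mentions collapse.
entities = {}
for text, label in pairs:
    entities[text] = label

# Label-keyed grouping keeps every mention, including repeats.
by_label = defaultdict(list)
for text, label in pairs:
    by_label[label].append(text)

print(entities)
print(dict(by_label))
```

The text-keyed form is enough when only the set of distinct entities matters; the grouped form is a better fit when the number of mentions per label is needed.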