Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)
Complete the code to load a spaCy model for English.
import spacy
nlp = spacy.load('[1]')
Common Mistakes
Using incorrect or made-up model names.
Forgetting to install the model before loading.
The correct model name to load the small English model in spaCy is en_core_web_sm.
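The second common mistake above (loading before installing) can be handled along these lines. This is a minimal sketch, not the exercise's reference answer: `load_english` is a hypothetical helper name, and falling back to a blank pipeline is an assumption about what a caller might want.

```python
import spacy


def load_english(name="en_core_web_sm"):
    """Load a trained English pipeline, or fall back to a blank one.

    `load_english` is a hypothetical helper, not part of spaCy.
    """
    try:
        return spacy.load(name)
    except OSError:
        # spacy.load raises OSError when the model package is not installed.
        # It can be installed with:
        #   python -m spacy download en_core_web_sm
        # The blank pipeline only tokenizes; it has no tagger or NER.
        return spacy.blank("en")


nlp = load_english()
print(nlp.lang, nlp.pipe_names)
```

If the fallback is used, `nlp.pipe_names` is empty, which is a quick way to see that no trained components were loaded.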
2. Fill in the blank (medium)
Complete the code to process text and get tokens using spaCy.
doc = nlp('[1]')
tokens = [token.text for token in doc]
Common Mistakes
Passing the nlp object itself instead of text.
Passing token attributes instead of raw text.
Pass a string of text to the nlp object. It returns a Doc, and iterating over the Doc yields its tokens.
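A short illustration of this step, assuming a blank English pipeline is acceptable (tokenization does not need a trained model, so nothing has to be downloaded). The sample sentence is made up for the example.

```python
import spacy

# Assumption: a blank pipeline (tokenizer only) stands in for a trained model.
nlp = spacy.blank("en")

# Pass raw text, not the nlp object and not a token attribute.
doc = nlp("Hello, world!")
tokens = [token.text for token in doc]
print(tokens)
```

Punctuation marks come out as separate tokens, so the list has four items rather than two.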
3. Fill in the blank (hard)
Fix the error in the code to get named entities from a spaCy doc.
for ent in doc.[1]:
    print(ent.text, ent.label_)
Common Mistakes
Using 'entities' or 'named_entities', which do not exist as Doc attributes.
Trying to iterate over 'tokens' for entities.
The correct attribute to access named entities in a spaCy doc is ents.
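To see `doc.ents` in action without downloading a trained model, the sketch below uses spaCy's rule-based `entity_ruler` component. That is a substitution made for this example: a trained pipeline such as en_core_web_sm would predict entities statistically instead. The patterns and the sentence are invented.

```python
import spacy

# Assumption: a rule-based entity ruler replaces the statistical NER
# component so the example runs without a downloaded model.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "ORG", "pattern": "Apple"},
    {"label": "GPE", "pattern": "London"},
])

doc = nlp("Apple is opening an office in London.")

# doc.ents is a tuple of Span objects, one per entity.
entities = [(ent.text, ent.label_) for ent in doc.ents]
for text, label in entities:
    print(text, label)
```

Each item in `doc.ents` is a span, not a single token, which is why iterating over tokens does not yield entities.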
4. Fill in the blank (hard)
Fill both blanks to create a dictionary of token texts and their parts of speech.
pos_dict = {token.[1]: token.[2] for token in doc}
Common Mistakes
Using lemma_ instead of text for keys.
Using tag_ instead of pos_ for parts of speech.
Use text for the token text and pos_ for the coarse-grained part-of-speech tag. The tag_ attribute holds the fine-grained tag instead.
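The dictionary comprehension can be tried out as follows. Since part-of-speech tags normally come from a trained tagger, this sketch sets them by hand when constructing the Doc; the words and tags are assumptions chosen for illustration, not model output.

```python
import spacy
from spacy.tokens import Doc

nlp = spacy.blank("en")

# Assumption: tags are assigned manually here; a trained pipeline
# would predict them when calling nlp(text).
words = ["The", "cat", "chased", "the", "mouse"]
pos = ["DET", "NOUN", "VERB", "DET", "NOUN"]
doc = Doc(nlp.vocab, words=words, pos=pos)

# Keys are token texts, values are coarse-grained POS tags.
pos_dict = {token.text: token.pos_ for token in doc}
print(pos_dict)
```

Because dictionary keys are unique, a token text that occurs more than once keeps only the tag of its last occurrence; here "The" and "the" differ in case, so both survive.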
5. Fill in the blank (hard)
Fill all three blanks to keep only the alphabetic tokens and lowercase their text.
filtered = [token.[1].lower() for token in doc if token.[2] and not token.[3]]
Common Mistakes
Using is_stop instead of is_punct to exclude tokens.
Not converting text to lowercase.
Use text to get the token text, is_alpha to check whether the token is alphabetic, and not is_punct to exclude punctuation.
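The filter can be checked on a small input. This sketch again assumes a blank English pipeline, since `is_alpha` and `is_punct` are lexical attributes that do not need a trained model; the sentence is made up.

```python
import spacy

# Assumption: a blank pipeline is sufficient for lexical attributes.
nlp = spacy.blank("en")
doc = nlp("Hello, World! 42 cats.")

# Keep alphabetic tokens, drop punctuation, and lowercase the text.
filtered = [
    token.text.lower()
    for token in doc
    if token.is_alpha and not token.is_punct
]
print(filtered)
```

The number "42" is dropped because `is_alpha` is false for digits. In practice `is_alpha` is also false for punctuation tokens, so the `not token.is_punct` check is likely redundant here, though it keeps the intent explicit.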