
Custom pipeline components in NLP - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of a simple custom pipeline component
What will be the output of the following code that adds a custom component to a spaCy pipeline which counts tokens?
import spacy
from spacy.language import Language
from spacy.tokens import Doc

# Register the extension attribute that the component writes to
Doc.set_extension('token_count', default=0)

@Language.component('token_counter')
def token_counter(doc):
    # len(doc) is the number of tokens, punctuation included
    doc._.token_count = len(doc)
    return doc

nlp = spacy.blank('en')
nlp.add_pipe('token_counter')
doc = nlp('Hello world! This is a test.')
print(doc._.token_count)
A. SyntaxError
B. 5
C. 6
D. 8
💡 Hint
Count the number of tokens in the sentence including punctuation.
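To see how the tokenizer treats punctuation, it can help to print the tokens of a different sentence. The snippet below is a minimal sketch that assumes the blank English pipeline from the question; the example sentence is made up for illustration.

```python
import spacy

# Blank English pipeline: tokenizer only, no trained model required.
nlp = spacy.blank("en")

# Illustrative sentence (not the one from the question).
doc = nlp("Good morning, everyone.")

# Each punctuation mark becomes its own token.
tokens = [token.text for token in doc]
print(tokens)
print(len(doc))
```

The same inspection applied to the sentence in the question gives the count directly.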
Model Choice
intermediate
Choosing the right custom pipeline component for sentiment analysis
You want to add a custom pipeline component that assigns a sentiment score to each document. Which component design is best?
A. A component that deletes tokens from the Doc to keep only positive words.
B. A component that modifies the Doc object by adding a custom attribute with the sentiment score.
C. A component that returns None instead of a Doc object.
D. A component that replaces the Doc object with a string containing the sentiment score.
💡 Hint
Remember that pipeline components must return a Doc object.
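A component that stores a score on the Doc and returns the same Doc could look roughly like the sketch below. The lexicon, the attribute name `sentiment_score`, and the component name `toy_sentiment` are hypothetical and chosen only for illustration; a real sentiment component would use a trained model or a proper lexicon.

```python
import spacy
from spacy.language import Language
from spacy.tokens import Doc

# Hypothetical toy lexicon, for illustration only.
TOY_LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0}

# force=True keeps the registration idempotent if the snippet is re-run.
Doc.set_extension("sentiment_score", default=0.0, force=True)

@Language.component("toy_sentiment")
def toy_sentiment(doc):
    # Sum the lexicon values of all tokens; unknown words contribute 0.
    doc._.sentiment_score = sum(TOY_LEXICON.get(t.lower_, 0.0) for t in doc)
    # A component must hand the Doc on to the next pipeline step.
    return doc

nlp = spacy.blank("en")
nlp.add_pipe("toy_sentiment")
doc = nlp("The food was great but the service was bad.")
print(doc._.sentiment_score)
```

The Doc itself is left intact, so later components still see every token.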
Hyperparameter
advanced
Setting hyperparameters in a custom spaCy pipeline component
You want to create a custom pipeline component that filters tokens by a minimum length parameter. How should you pass this parameter to the component?
A. Register the component as a factory (in spaCy v3, with @Language.factory) that accepts the parameter and returns the actual component function.
B. Hardcode the minimum length inside the component function without parameters.
C. Pass the parameter as a global variable outside the component.
D. Set the parameter inside the Doc object before processing.
💡 Hint
Think about how spaCy components are registered and initialized.
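In spaCy v3, settings are usually passed to a component through a factory and the `config` argument of `nlp.add_pipe`. The sketch below assumes that API; the names `length_filter`, `min_length`, and `long_tokens` are hypothetical. Instead of deleting tokens, it records the texts of tokens that meet the length threshold.

```python
import spacy
from spacy.language import Language
from spacy.tokens import Doc

# Custom attribute that will hold the texts of sufficiently long tokens.
Doc.set_extension("long_tokens", default=None, force=True)

@Language.factory("length_filter", default_config={"min_length": 3})
def create_length_filter(nlp, name, min_length: int):
    # The factory receives the setting and returns the actual component.
    def length_filter(doc):
        doc._.long_tokens = [t.text for t in doc if len(t.text) >= min_length]
        return doc
    return length_filter

nlp = spacy.blank("en")
# Override the default min_length when adding the component.
nlp.add_pipe("length_filter", config={"min_length": 5})
doc = nlp("Hello world! This is a test.")
print(doc._.long_tokens)
```

Because the value lives in the pipeline config, it is saved and restored together with the pipeline rather than depending on a global variable.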
🔧 Debug
advanced
Debugging a custom pipeline component that raises an error
Consider this custom component:
@Language.component('uppercase_tokens')
def uppercase_tokens(doc):
    for token in doc:
        token.text = token.text.upper()
    return doc
Why does this code raise an error when the pipeline processes a text?
A. Because token.text is a read-only property and cannot be assigned to.
B. Because the function does not return a Doc object.
C. Because the pipeline does not support loops over tokens.
D. Because the component name is invalid.
💡 Hint
Check if token.text can be changed directly.
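One way to keep an uppercased form without touching the original text is to store it in a custom token attribute. The sketch below is one possible workaround, not the only one; the attribute name `upper` and the component name `uppercase_attr` are hypothetical.

```python
import spacy
from spacy.language import Language
from spacy.tokens import Token

# Custom token attribute that holds the uppercased form.
Token.set_extension("upper", default=None, force=True)

@Language.component("uppercase_attr")
def uppercase_attr(doc):
    for token in doc:
        # Write to the extension attribute instead of token.text.
        token._.upper = token.text.upper()
    return doc

nlp = spacy.blank("en")
nlp.add_pipe("uppercase_attr")
doc = nlp("Hello world")
print([t._.upper for t in doc])

# Direct assignment to token.text is rejected by spaCy.
try:
    doc[0].text = "HELLO"
    assignment_failed = False
except (AttributeError, TypeError):
    assignment_failed = True
print(assignment_failed)
```

The original text stays available through `token.text`, which keeps character offsets and downstream components consistent.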
🧠 Conceptual
expert
Understanding the order of custom pipeline components
You have two custom components: one that lemmatizes tokens and another that filters out stop words. In which order should you add them to the pipeline for best results?
A. Order does not matter; both can be added in any sequence.
B. Add the stop word filter first, then the lemmatizer.
C. Add the lemmatizer first, then the stop word filter.
D. Add both components simultaneously using add_pipe with the same name.
💡 Hint
Think about how lemmatization affects token forms before filtering.
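Component order is controlled when calling `nlp.add_pipe`, which accepts `before`, `after`, `first`, and `last` arguments. The sketch below uses two placeholder components (`step_one` and `step_two`, both hypothetical) that only record when they ran, so the effect of the ordering is visible without a real lemmatizer or stop word filter.

```python
import spacy
from spacy.language import Language
from spacy.tokens import Doc

# Records the names of the components in the order they ran.
Doc.set_extension("trace", default=None, force=True)

@Language.component("step_one")
def step_one(doc):
    doc._.trace = (doc._.trace or []) + ["step_one"]
    return doc

@Language.component("step_two")
def step_two(doc):
    doc._.trace = (doc._.trace or []) + ["step_two"]
    return doc

nlp = spacy.blank("en")
nlp.add_pipe("step_two")
# Insert step_one ahead of step_two even though it is added later.
nlp.add_pipe("step_one", before="step_two")

print(nlp.pipe_names)
doc = nlp("Order matters.")
print(doc._.trace)
```

Each component sees the Doc as left by the previous one, which is why a step that depends on another step's output has to come after it.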