Which spaCy model is the smallest and fastest but provides less detailed linguistic features?
Think about the naming convention where 'sm' stands for small.
The en_core_web_sm model is the smallest and fastest spaCy English model. It provides basic linguistic features (tokenization, tagging, parsing, named entities) but is less accurate than the medium (md), large (lg), or transformer-based (trf) models.
What will be the output of the following code snippet?
import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp('Apple is looking at buying U.K. startup for $1 billion')
print([(ent.text, ent.label_) for ent in doc.ents])
Look at common entity labels: ORG for organizations, GPE for geopolitical entities, MONEY for monetary values.
The model recognizes 'Apple' as an organization (ORG), 'U.K.' as a geopolitical entity (GPE), and '$1 billion' as money (MONEY), so the typical output with en_core_web_sm is [('Apple', 'ORG'), ('U.K.', 'GPE'), ('$1 billion', 'MONEY')]. Note that predictions are statistical and can vary slightly between model versions.
You want to use spaCy for a project that requires the highest accuracy using transformer models. Which model should you choose?
Transformer models usually have 'trf' in their name.
The en_core_web_trf model uses a transformer architecture for the highest accuracy, but it requires more resources (a GPU is recommended) and is slower than the other models.
When loading a spaCy model with spacy.load(), which parameter allows you to disable specific pipeline components to speed up processing?
Check spaCy documentation for the parameter name to disable components.
The disable parameter of spacy.load() disables the named pipeline components: they are loaded but skipped when text is processed, which speeds up processing when their output is not needed.
What error will occur if you try to load a spaCy model that is not installed on your system using spacy.load('en_core_web_md')?
Think about spaCy's error message when a model is missing.
If the model is not installed, spacy.load() raises an OSError (spaCy error E050) with a message like "Can't find model 'en_core_web_md'. It doesn't seem to be a Python package or a valid path to a data directory."
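A sketch demonstrating the error, using a deliberately nonexistent model name (hypothetical, chosen so the load always fails):

```python
import spacy

try:
    # "xx_nonexistent_model" is a made-up name that is never installed,
    # so this load raises OSError (spaCy error E050).
    nlp = spacy.load("xx_nonexistent_model")
except OSError as err:
    print("Model not found:", err)
```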