Easy · Conceptual · Q1 of 15
NLP - Word Embeddings
Which characteristic of FastText embeddings enables them to generate vectors for words not present in the training data?
A. They use one-hot encoding for each word
B. They represent words as bags of character n-grams
C. They rely solely on word frequency counts
D. They use fixed-length word hashing
Step-by-Step Solution
  1. Step 1: Understand FastText's approach

    FastText breaks each word into character n-grams (subword units) and represents the word as the sum of its n-gram vectors, so it captures subword information rather than treating the word as a single symbol.
  2. Step 2: Compare with traditional embeddings

    Traditional embeddings such as Word2Vec and GloVe treat words as atomic units, so a word unseen in training has no vector; FastText can still compose a vector from the n-grams of any string.
  3. Final Answer:

    They represent words as bags of character n-grams → Option B
  4. Quick Check:

    Subword modeling is exactly what enables embeddings for unseen words ✓
Quick Trick: FastText uses subword units, so even unseen words get vectors ✓
Common Mistakes:
  • Assuming FastText uses one-hot encoding
  • Thinking FastText ignores character-level info
  • Believing FastText relies only on word frequency
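The mechanism behind Option B can be sketched in a few lines. This is a toy illustration, not the real FastText implementation: the n-gram extraction (with `<` and `>` boundary markers, n = 3–6 by default) follows the FastText scheme, while the embedding table here is just random vectors indexed by a hash, standing in for trained subword embeddings.

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    # FastText wraps the word in boundary markers before extracting n-grams
    w = f"<{word}>"
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

# Toy subword table: hash each n-gram into a small bucket of random vectors
# (in real FastText these bucket vectors are learned during training)
DIM, BUCKETS = 8, 1000
rng = np.random.default_rng(0)
table = rng.normal(size=(BUCKETS, DIM))

def embed(word):
    # A word vector is the average of its n-gram vectors, so any string —
    # even one never seen in training — still maps to a vector
    grams = char_ngrams(word)
    return np.mean([table[hash(g) % BUCKETS] for g in grams], axis=0)

print(char_ngrams("where", 3, 3))  # ['<wh', 'whe', 'her', 'ere', 're>']
print(embed("whereish").shape)     # an out-of-vocabulary word still gets a vector
```

Because the lookup happens at the n-gram level, misspellings and rare morphological variants share most of their subwords with known words, which is why their toy vectors here (and real FastText vectors) land near the original word's vector.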
