
FastText embeddings in NLP - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual (intermediate)
How does FastText handle out-of-vocabulary words?

FastText can create embeddings for words not seen during training. How does it do this?

A. It uses a dictionary of synonyms to find a known word vector.
B. It ignores unknown words and assigns them a zero vector.
C. It breaks words into character n-grams and sums their vectors to form the word vector.
D. It randomly initializes a vector for unknown words each time.
💡 Hint

Think about how FastText uses parts of words to build vectors.

Predict Output (intermediate)
Output of FastText vector dimension

What is the output of the following code snippet?

from gensim.models import FastText
sentences = [['hello', 'world'], ['machine', 'learning']]
model = FastText(sentences, vector_size=10, window=3, min_count=1, epochs=5)
print(len(model.wv['hello']))
A. 5
B. 10
C. 3
D. 1
💡 Hint

Check the vector_size parameter used when creating the model.

Model Choice (advanced)
Choosing FastText for morphologically rich languages

You want to build word embeddings for a language with many word forms and suffixes. Which model is best suited?

A. Word2Vec, because it learns embeddings only from whole words.
B. One-hot encoding, because it is simple and effective.
C. GloVe, because it uses global co-occurrence statistics.
D. FastText, because it uses subword information to handle word variations.
💡 Hint

Consider how subword information helps with many word forms.

Metrics (advanced)
Evaluating FastText embeddings with analogy tasks

You evaluate FastText embeddings on a word analogy task (e.g., king - man + woman = ?). Which metric best measures performance?

A. Accuracy of correctly predicted analogy words.
B. Mean squared error between vectors.
C. Perplexity of the embedding model.
D. Loss value during training.
💡 Hint

Think about how analogy tasks are scored.

🔧 Debug (expert)
Debugging FastText training convergence issue

You train a FastText model but notice the loss does not decrease after many epochs. Which is the most likely cause?

A. The learning rate is too high, causing the model to overshoot minima.
B. The vector size is too large, causing overfitting.
C. The window size is too small, so context is ignored.
D. The training data is too large, causing slow convergence.
💡 Hint

Consider how learning rate affects training stability.