Recall & Review
beginner
What is the main goal of Word2Vec?
Word2Vec aims to learn word meanings by turning words into numbers (vectors) so that words with similar meanings have similar vectors.
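To make "similar vectors" concrete, here is a minimal sketch of cosine similarity, a common way to compare word vectors. The three vectors are hand-made toy values chosen for illustration, not output from a trained model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional vectors (real embeddings usually have 100+ dimensions).
vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.1],
    "apple": [0.1, 0.2, 0.9],
}

sim_royal = cosine_similarity(vectors["king"], vectors["queen"])
sim_fruit = cosine_similarity(vectors["king"], vectors["apple"])
print(sim_royal > sim_fruit)  # True: 'king' points closer to 'queen' than to 'apple'
```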
beginner
Explain the Continuous Bag of Words (CBOW) model in Word2Vec.
CBOW predicts a target word based on its surrounding words (context). It looks at the words around a missing word and guesses what the missing word is.
beginner
What does the Skip-gram model do in Word2Vec?
Skip-gram takes a target word and tries to predict the words around it (context). It learns which words tend to appear near the target word.
beginner
Why is Word2Vec useful in real life?
Word2Vec helps computers understand language better, which improves things like search engines, chatbots, and translation by knowing word meanings and relationships.
intermediate
How do CBOW and Skip-gram differ in training focus?
CBOW predicts the center word from context words, while Skip-gram predicts context words from the center word. CBOW is faster; Skip-gram works better with rare words.
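The difference in training focus can be sketched by listing the training examples each model would build from one sentence. This is a simplified illustration in plain Python: the helper names are hypothetical, and real Word2Vec implementations add details such as subsampling and negative sampling that are omitted here.

```python
def context_window(tokens, i, window):
    """Words within `window` positions of index i, excluding tokens[i] itself."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right

def cbow_examples(tokens, window=2):
    # CBOW: all context words go in together, one center word comes out.
    return [(context_window(tokens, i, window), tokens[i])
            for i in range(len(tokens))]

def skipgram_examples(tokens, window=2):
    # Skip-gram: the center word goes in, one context word comes out per pair.
    return [(tokens[i], ctx)
            for i in range(len(tokens))
            for ctx in context_window(tokens, i, window)]

tokens = "the king rules the land".split()
cbow = cbow_examples(tokens, window=1)
skipgram = skipgram_examples(tokens, window=1)

print(cbow[1])                    # (['the', 'rules'], 'king')
print(skipgram[:3])               # [('the', 'king'), ('king', 'the'), ('king', 'rules')]
print(len(cbow), len(skipgram))   # 5 8
```

CBOW yields one example per position, while Skip-gram yields one per (center, context) pair, which is one reason CBOW tends to train faster.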
What does the Skip-gram model predict?
A. Surrounding words from the target word
B. The target word from surrounding words
C. The next sentence in a paragraph
D. The frequency of a word in a document
Answer: A
Skip-gram predicts the context words around a given target word.
Which Word2Vec model is generally faster to train?
A. Neither, they are slow
B. Skip-gram
C. Both are equally fast
D. CBOW
Answer: D
CBOW is usually faster because it makes one prediction per position from the combined context, whereas Skip-gram makes a separate prediction for each context word.
What is the main output of Word2Vec?
A. Word frequency counts
B. Part-of-speech tags
C. Word vectors (embeddings)
D. Sentence summaries
Answer: C
Word2Vec outputs word vectors that capture word meanings.
Which model is better for rare words?
A. CBOW
B. Skip-gram
C. Both perform the same
D. Neither handles rare words
Answer: B
Skip-gram tends to work better with rare words because each occurrence of a rare word produces its own training pairs, instead of being averaged together with other context words.
In CBOW, what is used to predict the target word?
A. The surrounding context words
B. Random words from the corpus
C. The entire sentence
D. The target word itself
Answer: A
CBOW uses the surrounding words to predict the missing target word.
Describe how the CBOW and Skip-gram models work in Word2Vec and their main differences.
Think about which words are inputs and which are outputs in each model.
Explain why Word2Vec embeddings are useful for language tasks.
Consider how turning words into numbers helps machines.
Practice
1. What is the main difference between the CBOW and Skip-gram models in Word2Vec?
easy
A. CBOW uses one-hot encoding, Skip-gram uses frequency encoding.
B. CBOW predicts a word based on its context, while Skip-gram predicts context words from a target word.
C. CBOW is used only for sentences, Skip-gram only for paragraphs.
D. CBOW requires labeled data, Skip-gram does not.
Solution
Step 1: Understand CBOW model purpose
CBOW tries to predict the target word using the surrounding context words.
Step 2: Understand Skip-gram model purpose
Skip-gram tries to predict the surrounding context words given the target word.
Final Answer:
CBOW predicts a word based on its context, while Skip-gram predicts context words from a target word. -> Option B
Quick Check:
CBOW = context to word, Skip-gram = word to context [OK]
Hint: Remember CBOW = context to word, Skip-gram = word to context [OK]
Common Mistakes:
Confusing which model predicts context vs. target word
Thinking both models do the same prediction
Assuming CBOW needs labeled data
2. Which of the following is the correct way to initialize a Skip-gram Word2Vec model using the Gensim library in Python?
easy
A. Word2Vec(sentences, size=100, window=5, sg=0)
B. Word2Vec(sentences, vector_size=100, window=5, sg=0)
C. Word2Vec(sentences, size=100, window=5, sg=1)
D. Word2Vec(sentences, vector_size=100, window=5, sg=1)
Solution
Step 1: Identify correct parameter for Skip-gram
In Gensim, 'sg=1' sets Skip-gram, 'sg=0' sets CBOW.
Step 2: Use correct parameter names
In Gensim 4.0 and later, 'vector_size' replaces 'size' as the name of the embedding-dimension parameter.
Final Answer:
Word2Vec(sentences, vector_size=100, window=5, sg=1) -> Option D
Quick Check:
sg=1 and vector_size used correctly [OK]
Hint: Use sg=1 for Skip-gram and vector_size for embedding size [OK]
Common Mistakes:
Using 'size' instead of 'vector_size' in recent Gensim versions
Setting sg=0 which is CBOW, not Skip-gram
Confusing sg parameter values
3. A Gensim Word2Vec model is trained with Skip-gram on a typical English corpus. What is model.wv.most_similar('king', topn=1) most likely to return?
medium
A. [('run', similarity_score)]
B. [('apple', similarity_score)]
C. [('queen', similarity_score)]
D. [('car', similarity_score)]
Solution
Step 1: Understand Word2Vec similarity
Word2Vec finds words with similar meanings or contexts; 'queen' is semantically close to 'king'.
Step 2: Analyze typical English corpus relations
Words like 'apple', 'car', or 'run' are unrelated to 'king' in meaning or context.
Final Answer:
[('queen', similarity_score)] -> Option C
Quick Check:
Of the options, only 'queen' is semantically close to 'king' [OK]
Hint: Most similar to 'king' is usually 'queen' in English corpora [OK]
Common Mistakes:
Choosing unrelated words as most similar
Confusing syntactic similarity with semantic similarity
Expecting exact similarity scores
4. You trained a CBOW Word2Vec model but get an error: KeyError: 'unknown_word' when querying model.wv['unknown_word']. What is the most likely cause and fix?
medium
A. The word was not in training data; retrain with larger corpus or check vocabulary before querying.
B. The model was trained with Skip-gram; switch to CBOW to fix.
C. The vector size is too small; increase vector_size parameter.
D. The window size is too large; reduce window parameter.
Solution
Step 1: Understand KeyError cause
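A KeyError occurs when the queried word is not in the model's vocabulary, either because it never appeared in the training data or because it appeared fewer times than min_count.
Step 2: Fix by ensuring word presence
Either add the word to the training data and retrain, or check whether the word exists before querying to avoid the error.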
Final Answer:
The word was not in training data; retrain with larger corpus or check vocabulary before querying. -> Option A
Quick Check:
KeyError means word missing in vocabulary [OK]
Hint: Check if word is in vocabulary before querying model vectors [OK]
Common Mistakes:
Assuming model type (CBOW/Skip-gram) causes KeyError
Changing vector or window size to fix missing word error
Ignoring vocabulary check before querying
5. You want to train a Word2Vec model to capture rare word meanings better. Which approach is best?
hard
A. Use Skip-gram with a smaller window size and increase training epochs.
B. Use CBOW with a large window size and fewer epochs.
C. Use Skip-gram with a large window size and fewer epochs.
D. Use CBOW with a smaller window size and increase training epochs.
Solution
Step 1: Identify model for rare words
Skip-gram is better at learning rare word representations than CBOW.
Step 2: Adjust window size and epochs
A smaller window keeps training focused on the words closest to each rare word, which can sharpen its learned meaning; more epochs give the few occurrences of a rare word more updates.
Final Answer:
Use Skip-gram with a smaller window size and increase training epochs. -> Option A
Quick Check:
Skip-gram + small window + more epochs = better rare word capture [OK]
Hint: Skip-gram + small window + more epochs helps rare words [OK]