Which of the following best describes the training objective of the Word2Vec Skip-gram model?
Think about what the model tries to guess during training.
The Skip-gram model learns to predict surrounding context words given a center word, helping it capture word relationships.
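To make the objective concrete, here is a small pure-Python sketch (not Gensim's internals) of how Skip-gram extracts (center, context) training pairs; the model is then trained to predict each context word given its center word:

```python
# Sketch: generate (center, context) skip-gram training pairs for one sentence.
def skipgram_pairs(sentence, window=2):
    pairs = []
    for i, center in enumerate(sentence):
        lo = max(0, i - window)
        hi = min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # every word in the window except the center itself
                pairs.append((center, sentence[j]))
    return pairs

print(skipgram_pairs(['cat', 'sat', 'on', 'the', 'mat'], window=1))
# [('cat', 'sat'), ('sat', 'cat'), ('sat', 'on'), ('on', 'sat'),
#  ('on', 'the'), ('the', 'on'), ('the', 'mat'), ('mat', 'the')]
```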
Given the following Python code using Gensim's Word2Vec, what is the output of the similarity calculation?
from gensim.models import Word2Vec

sentences = [['cat', 'sat', 'on', 'the', 'mat'],
             ['dog', 'barked', 'at', 'the', 'cat']]
model = Word2Vec(sentences, vector_size=10, window=2, min_count=1, epochs=10)
sim = model.wv.similarity('cat', 'dog')
print(round(sim, 2))
Both 'cat' and 'dog' appear in the sentences, so similarity is computed.
The similarity between 'cat' and 'dog' is positive but less than 1: they appear in related contexts in this small corpus, so their vectors point in similar directions, but they are different words, so the vectors are not identical. Note that the exact value varies between runs, since the model is randomly initialized and no seed is fixed.
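To see why the value stays below 1: `wv.similarity` is cosine similarity, which equals 1 only when two vectors point in exactly the same direction. A minimal sketch with hypothetical 2-d vectors (the numbers are illustrative, not learned embeddings):

```python
import math

def cosine(u, v):
    # dot product divided by the product of the vector norms
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical embeddings: related but not identical directions.
cat = [0.9, 0.4]
dog = [0.8, 0.6]
print(round(cosine(cat, cat), 2))  # 1.0  (identical vectors)
print(round(cosine(cat, dog), 2))  # 0.97 (similar direction, but below 1.0)
```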
You want to train word embeddings that capture rare word relationships well in a small dataset. Which Word2Vec architecture is best?
One architecture is better at learning from rare words.
The Skip-gram model performs better on rare words because it predicts each context word from the center word, so every occurrence of a rare word yields several training updates for that word, giving infrequent words better representations than CBOW's averaged context.
What is the effect of increasing the window size parameter in Word2Vec training?
Window size controls how many words around the center word are used.
A larger window size means the model draws context from more surrounding words, so it tends to learn broader topical/semantic relatedness, at the cost of the fine-grained syntactic detail that smaller windows capture better.
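To see the effect concretely, count how many (center, context) training pairs a sentence yields at two window sizes (a pure-Python sketch of the pair extraction, not Gensim's internals):

```python
def count_pairs(sentence, window):
    # number of (center, context) training pairs for a given window size
    n = 0
    for i in range(len(sentence)):
        lo = max(0, i - window)
        hi = min(len(sentence), i + window + 1)
        n += hi - lo - 1  # all positions in the window except the center itself
    return n

sent = ['dog', 'barked', 'at', 'the', 'cat']
print(count_pairs(sent, window=1))  # 8
print(count_pairs(sent, window=2))  # 14: a wider window means more, broader context pairs
```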
Which metric is commonly used to evaluate the quality of trained Word2Vec embeddings on a word analogy task?
Think about how well embeddings solve analogy puzzles like 'king - man + woman = ?'
Accuracy on analogy tasks measures how often the embeddings correctly complete analogies such as 'king - man + woman ≈ queen', reflecting the semantic quality of the vector space.
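The analogy test works by vector arithmetic: the word whose embedding is nearest (by cosine similarity) to vec('king') - vec('man') + vec('woman') should be 'queen'. A sketch with small hypothetical vectors chosen so the analogy holds (real evaluations use learned embeddings and benchmark question sets):

```python
import math

# Hypothetical 2-d embeddings; in practice these are learned by Word2Vec.
emb = {
    'king':  [0.9, 0.8],
    'queen': [0.9, 0.2],
    'man':   [0.1, 0.8],
    'woman': [0.1, 0.2],
    'apple': [0.5, 0.5],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def analogy(a, b, c):
    # answer to "a is to b as c is to ?", excluding the query words themselves
    target = [x - y + z for x, y, z in zip(emb[b], emb[a], emb[c])]
    candidates = (w for w in emb if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(emb[w], target))

print(analogy('man', 'king', 'woman'))  # queen
```

Analogy accuracy is simply the fraction of such benchmark questions where the top-ranked candidate matches the expected answer.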