Which of the following best describes the training objective of the Word2Vec Skip-gram model?
Think about what the model tries to guess during training.
The Skip-gram model learns to predict surrounding context words given a center word, helping it capture word relationships.
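To make the objective concrete, here is a small pure-Python sketch (not Gensim's internals) of how Skip-gram extracts (center, context) training pairs; the model is then trained to predict each context word given its center word:

```python
# Sketch: generate (center, context) skip-gram training pairs for one sentence.
def skipgram_pairs(sentence, window=2):
    pairs = []
    for i, center in enumerate(sentence):
        lo = max(0, i - window)
        hi = min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # every word in the window except the center itself
                pairs.append((center, sentence[j]))
    return pairs

print(skipgram_pairs(['cat', 'sat', 'on', 'the', 'mat'], window=1))
# [('cat', 'sat'), ('sat', 'cat'), ('sat', 'on'), ('on', 'sat'),
#  ('on', 'the'), ('the', 'on'), ('the', 'mat'), ('mat', 'the')]
```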
Given the following Python code using Gensim's Word2Vec, what is the output of the similarity calculation?
from gensim.models import Word2Vec

sentences = [['cat', 'sat', 'on', 'the', 'mat'],
             ['dog', 'barked', 'at', 'the', 'cat']]
model = Word2Vec(sentences, vector_size=10, window=2, min_count=1, epochs=10)
sim = model.wv.similarity('cat', 'dog')
print(round(sim, 2))
Both 'cat' and 'dog' appear in the sentences, so similarity is computed.
The similarity between 'cat' and 'dog' is positive but less than 1: they appear in related contexts in this small corpus, so their vectors point in similar directions, but they are different words, so the vectors are not identical. Note that the exact value varies between runs, since the model is randomly initialized and no seed is fixed.
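To see why the value stays below 1: `wv.similarity` is cosine similarity, which equals 1 only when two vectors point in exactly the same direction. A minimal sketch with hypothetical 2-d vectors (the numbers are illustrative, not learned embeddings):

```python
import math

def cosine(u, v):
    # dot product divided by the product of the vector norms
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical embeddings: related but not identical directions.
cat = [0.9, 0.4]
dog = [0.8, 0.6]
print(round(cosine(cat, cat), 2))  # 1.0  (identical vectors)
print(round(cosine(cat, dog), 2))  # 0.97 (similar direction, but below 1.0)
```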
You want to train word embeddings that capture rare word relationships well in a small dataset. Which Word2Vec architecture is best?
One architecture is better at learning from rare words.
The Skip-gram model performs better on rare words because it predicts each context word from the center word, so every occurrence of a rare word yields several training updates for that word, giving infrequent words better representations than CBOW's averaged context.
What is the effect of increasing the window size parameter in Word2Vec training?
Window size controls how many words around the center word are used.
A larger window size means the model draws context from more surrounding words, so it tends to learn broader topical/semantic relatedness, at the cost of the fine-grained syntactic detail that smaller windows capture better.
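To see the effect concretely, count how many (center, context) training pairs a sentence yields at two window sizes (a pure-Python sketch of the pair extraction, not Gensim's internals):

```python
def count_pairs(sentence, window):
    # number of (center, context) training pairs for a given window size
    n = 0
    for i in range(len(sentence)):
        lo = max(0, i - window)
        hi = min(len(sentence), i + window + 1)
        n += hi - lo - 1  # all positions in the window except the center itself
    return n

sent = ['dog', 'barked', 'at', 'the', 'cat']
print(count_pairs(sent, window=1))  # 8
print(count_pairs(sent, window=2))  # 14: a wider window means more, broader context pairs
```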
Which metric is commonly used to evaluate the quality of trained Word2Vec embeddings on a word analogy task?
Think about how well embeddings solve analogy puzzles like 'king - man + woman = ?'
Accuracy on analogy tasks measures how often the embeddings correctly complete analogies such as 'king - man + woman ≈ queen', reflecting the semantic quality of the vector space.
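The analogy test works by vector arithmetic: the word whose embedding is nearest (by cosine similarity) to vec('king') - vec('man') + vec('woman') should be 'queen'. A sketch with small hypothetical vectors chosen so the analogy holds (real evaluations use learned embeddings and benchmark question sets):

```python
import math

# Hypothetical 2-d embeddings; in practice these are learned by Word2Vec.
emb = {
    'king':  [0.9, 0.8],
    'queen': [0.9, 0.2],
    'man':   [0.1, 0.8],
    'woman': [0.1, 0.2],
    'apple': [0.5, 0.5],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def analogy(a, b, c):
    # answer to "a is to b as c is to ?", excluding the query words themselves
    target = [x - y + z for x, y, z in zip(emb[b], emb[a], emb[c])]
    candidates = (w for w in emb if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(emb[w], target))

print(analogy('man', 'king', 'woman'))  # queen
```

Analogy accuracy is simply the fraction of such benchmark questions where the top-ranked candidate matches the expected answer.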