
Word2Vec (CBOW and Skip-gram) in NLP - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
Task 1: Fill in the blank (easy)

Complete the code to import the Word2Vec model from gensim.

from gensim.models import [1]
A. Word2Vec
B. FastText
C. Doc2Vec
D. LdaModel
Common Mistakes
Importing unrelated models like FastText or Doc2Vec instead of Word2Vec.
Task 2: Fill in the blank (medium)

Complete the code to initialize a CBOW Word2Vec model with vector size 100.

model = Word2Vec(sentences, vector_size=[1], window=5, sg=0, min_count=1)
A. 50
B. 200
C. 100
D. 300
Common Mistakes
Using vector_size values that are too small or too large without reason.
Task 3: Fill in the blank (hard)

Fix the error in the code to train a Skip-gram Word2Vec model.

model = Word2Vec(sentences, vector_size=100, window=5, sg=[1], min_count=1)
A. 0
B. 1
C. 2
D. -1
Common Mistakes
Using sg=0 which trains CBOW instead of Skip-gram.
Using invalid values like 2 or -1 for sg.
Task 4: Fill in the blanks (hard)

Fill both blanks to create a dictionary of word vectors for words with frequency above 2.

word_vectors = {word: model.wv[[1]] for word in model.wv.index_to_key if model.wv.get_vecattr(word, '[2]') > 2}
A. word
B. count
C. frequency
D. index
Common Mistakes
Using 'frequency' instead of 'count' for the attribute name.
Using 'index' instead of the actual word to get vectors.
Task 5: Fill in the blanks (hard)

Fill all three blanks to find the top 3 most similar words to 'king'.

similar_words = model.wv.most_similar(positive=[[1]], topn=[2])
result = [word for word, [3] in similar_words]
A. 'king'
B. 3
C. similarity
D. 'queen'
Common Mistakes
Using the wrong variable name instead of 'similarity' in the unpacking.
Passing a word not in quotes to the positive list.