Recall & Review

beginner

What is Word2Vec?

Word2Vec is a method that maps each word to a numeric vector, learned from the contexts the word appears in, so that words used in similar contexts end up with similar vectors.

beginner

What are the two main architectures of Word2Vec?

The two main architectures are CBOW (Continuous Bag of Words), which predicts a word from its neighbors, and Skip-gram, which predicts the neighbors from a word.
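To make the difference concrete, here is a minimal pure-Python sketch of the training examples each architecture would see. The helper names (`context_window`, `cbow_examples`, `skipgram_examples`) and the toy sentence are illustrative assumptions, not part of Gensim. In Gensim itself the choice is made with the `sg` parameter (`sg=0` for CBOW, `sg=1` for Skip-gram).

```python
def context_window(tokens, i, window):
    """Return the words within `window` positions of tokens[i], excluding tokens[i]."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def cbow_examples(tokens, window=2):
    # CBOW: the surrounding words are the input, the center word is the target.
    return [(context_window(tokens, i, window), tokens[i])
            for i in range(len(tokens))]


def skipgram_examples(tokens, window=2):
    # Skip-gram: the center word is the input, each neighbor is a separate target.
    return [(tokens[i], ctx)
            for i in range(len(tokens))
            for ctx in context_window(tokens, i, window)]


sentence = ["the", "cat", "sat", "on", "the", "mat"]  # toy sentence

print(cbow_examples(sentence, window=1)[1])
# (['the', 'sat'], 'cat')
print(skipgram_examples(sentence, window=1)[:3])
# [('the', 'cat'), ('cat', 'the'), ('cat', 'sat')]
```

CBOW produces one example per position, while Skip-gram produces one example per (word, neighbor) pair.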

intermediate

In Gensim, how do you start training a Word2Vec model on a list of sentences?

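Create a Word2Vec object with parameters such as vector_size and window, call .build_vocab() with your sentences, then call .train() with the sentences, total_examples, and epochs. Alternatively, pass the sentences directly to the Word2Vec constructor, which builds the vocabulary and trains in one step.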

beginner

What does the 'window' parameter control in Word2Vec training?

The 'window' parameter sets the maximum number of words before and after the target word that are treated as its context during training.
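As a rough illustration of what a larger window means, the sketch below lists the context words of one target for several window sizes. The `neighbors` helper and the sentence are hypothetical and only mimic the idea; they are not Gensim's internal code.

```python
def neighbors(tokens, i, window):
    """Words at most `window` positions away from tokens[i], excluding tokens[i]."""
    start = max(0, i - window)
    return tokens[start:i] + tokens[i + 1:i + 1 + window]


sentence = ["we", "train", "word", "vectors", "with", "gensim"]
target = sentence.index("word")

for window in (1, 2, 3):
    print(window, neighbors(sentence, target, window))
# 1 ['train', 'vectors']
# 2 ['we', 'train', 'vectors', 'with']
# 3 ['we', 'train', 'vectors', 'with', 'gensim']
```

The context is clipped at the sentence boundaries, so a window of 3 still finds only two words to the left of the target.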

intermediate

How can you check the similarity between two words using a trained Word2Vec model in Gensim?

Call model.wv.similarity('word1', 'word2'). It returns the cosine similarity of the two word vectors, where a higher score means the words are more similar.
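The score is a cosine similarity, which can be worked through by hand. The sketch below computes it in plain Python on made-up 3-dimensional vectors; the vectors and the `cosine_similarity` helper are illustrative assumptions, not output from a trained model.

```python
import math


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical vectors, chosen only to make the arithmetic easy to follow.
king = [0.8, 0.6, 0.0]
queen = [0.6, 0.8, 0.0]
apple = [0.0, 0.0, 1.0]

print(round(cosine_similarity(king, queen), 2))  # 0.96
print(cosine_similarity(king, apple))            # 0.0
```

Vectors pointing in nearly the same direction score close to 1, and vectors at right angles score 0.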

Which Gensim method is used to prepare the vocabulary before training Word2Vec?

A. fit()

B. train()

C. build_vocab()

D. init_vocab()

What does the 'vector_size' parameter specify in Word2Vec?

A. Number of words in the vocabulary

B. Size of the training batch

C. Number of training epochs

D. Length of the word vectors

Which Word2Vec architecture predicts the center word from surrounding words?

A. CBOW

B. RNN

C. Skip-gram

D. Transformer

How do you save a trained Word2Vec model in Gensim?

A. model.export('filename')

B. model.save('filename')

C. model.write('filename')

D. model.store('filename')

What type of data does Word2Vec expect for training?

A. List of sentences, each sentence is a list of words

B. Single long string of text

C. Dictionary of word counts

D. List of word vectors
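Raw text therefore has to be split into tokenized sentences before training. Below is a minimal sketch using only the standard library; the `to_sentences` helper and its simple regular expressions are assumptions made for illustration, and real projects often use a proper tokenizer (Gensim ships `gensim.utils.simple_preprocess` for basic cases).

```python
import re


def to_sentences(raw_text):
    """Split raw text into sentences, then each sentence into lowercase tokens."""
    sentences = []
    for chunk in re.split(r"[.!?]+", raw_text):
        tokens = re.findall(r"[a-z0-9']+", chunk.lower())
        if tokens:  # skip empty chunks left over after the last punctuation mark
            sentences.append(tokens)
    return sentences


raw = "Word2Vec learns vectors. Gensim trains it quickly!"
corpus = to_sentences(raw)
print(corpus)
# [['word2vec', 'learns', 'vectors'], ['gensim', 'trains', 'it', 'quickly']]
```

The resulting list of token lists is the shape of input that the Word2Vec constructor, build_vocab(), and train() work with.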

Explain how to train a Word2Vec model using Gensim starting from raw text data.

Describe the difference between CBOW and Skip-gram architectures in Word2Vec.

Practice

1. What is the main purpose of training a Word2Vec model using Gensim?

easy

A. To count the frequency of words in a text

B. To translate text from one language to another

C. To convert words into meaningful number vectors

D. To remove stop words from a text

Training Word2Vec with Gensim in NLP - Cheat Sheet & Quick Revision

Solution

Step 1: Understand Word2Vec's goal

Step 2: Identify Gensim's role

Final Answer:

Quick Check:

Solution

Step 1: Recall Python import syntax

Step 2: Match Gensim's Word2Vec import

Final Answer:

Quick Check:

Solution

Step 1: Understand model.wv['word'] output

Step 2: Check training and vocabulary

Final Answer:

Quick Check:

Solution

Step 1: Check Word2Vec parameters

Step 2: Verify other code parts

Final Answer:

Quick Check:

Solution

Step 1: Analyze each change's effect on speed and quality

Step 2: Choose changes that speed up without much quality loss

Final Answer:

Quick Check: