Training Word2Vec with Gensim in NLP - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is Word2Vec?
Word2Vec is a method that maps words to dense numeric vectors, learned from the contexts in which words appear, so that words used in similar contexts end up with similar vectors.
beginner
What are the two main architectures of Word2Vec?
The two main architectures are CBOW (Continuous Bag of Words), which predicts a word from its neighbors, and Skip-gram, which predicts the neighbors from a word.
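To make the difference concrete, here is a minimal sketch in plain Python (no Gensim) of the training examples each architecture would see. The helper name `training_pairs` and the toy sentence are illustrative assumptions, and real implementations add refinements such as negative sampling that are left out here.

```python
def training_pairs(tokens, window=1):
    """Return (cbow, skipgram) training examples for one tokenized sentence."""
    cbow, skipgram = [], []
    for i, target in enumerate(tokens):
        # Context = up to `window` words on each side of the target.
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        # CBOW: the context words together predict the target.
        cbow.append((context, target))
        # Skip-gram: the target predicts each context word separately.
        skipgram.extend((target, c) for c in context)
    return cbow, skipgram


cbow, skipgram = training_pairs(["the", "cat", "sat"], window=1)
print(cbow)      # [(['cat'], 'the'), (['the', 'sat'], 'cat'), (['cat'], 'sat')]
print(skipgram)  # [('the', 'cat'), ('cat', 'the'), ('cat', 'sat'), ('sat', 'cat')]
```

In Gensim the architecture is selected with the `sg` parameter of `Word2Vec`: `sg=0` (the default) uses CBOW and `sg=1` uses Skip-gram.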
intermediate
In Gensim, how do you start training a Word2Vec model on a list of sentences?
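Create a Word2Vec object with parameters such as vector_size and window, call .build_vocab() with your sentences, then call .train() with the same sentences plus total_examples and epochs. Alternatively, pass the sentences directly to the Word2Vec constructor, which builds the vocabulary and trains in one step.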
beginner
What does the 'window' parameter control in Word2Vec training?
The 'window' parameter sets the maximum number of words before and after the target word that are treated as context during training.
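As a rough illustration of what the window covers, the sketch below collects the context of one target word for different window sizes. The helper `context_words` is a hypothetical name, and a fixed-size window is a simplification of what a training library does internally.

```python
def context_words(tokens, index, window):
    """Words within `window` positions of tokens[index], excluding the target itself."""
    left = tokens[max(0, index - window):index]
    right = tokens[index + 1:index + 1 + window]
    return left + right


tokens = ["i", "like", "strong", "black", "coffee"]
print(context_words(tokens, 2, window=1))  # ['like', 'black']
print(context_words(tokens, 2, window=2))  # ['i', 'like', 'black', 'coffee']
print(context_words(tokens, 0, window=2))  # ['like', 'strong']
```

The last call shows that the window is simply cut off at the edge of the sentence.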
intermediate
How can you check the similarity between two words using a trained Word2Vec model in Gensim?
Use the model's .wv.similarity('word1', 'word2') method. It returns the cosine similarity of the two word vectors, where higher values mean the words are more similar.
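The score behind this comparison is cosine similarity. The plain-Python sketch below shows the calculation on made-up vectors; the vectors and the function name `cosine_similarity` are assumptions for illustration, not output from a trained model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up 3-dimensional vectors, for illustration only.
king = [1.0, 2.0, 0.0]
queen = [2.0, 4.0, 0.0]  # same direction as `king`
apple = [0.0, 0.0, 3.0]  # orthogonal to `king`

print(cosine_similarity(king, queen))  # 1.0 up to floating-point rounding
print(cosine_similarity(king, apple))  # 0.0
```

Vectors pointing in the same direction score close to 1, and unrelated (orthogonal) vectors score 0.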
Which Gensim method is used to prepare the vocabulary before training Word2Vec?
A. fit()
B. train()
C. build_vocab()
D. init_vocab()
What does the 'vector_size' parameter specify in Word2Vec?
A. Number of words in the vocabulary
B. Size of the training batch
C. Number of training epochs
D. Length of the word vectors
Which Word2Vec architecture predicts the center word from surrounding words?
A. CBOW
B. RNN
C. Skip-gram
D. Transformer
How do you save a trained Word2Vec model in Gensim?
A. model.export('filename')
B. model.save('filename')
C. model.write('filename')
D. model.store('filename')
What type of data does Word2Vec expect for training?
A. List of sentences, each sentence is a list of words
B. Single long string of text
C. Dictionary of word counts
D. List of word vectors
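Getting from raw text to a list of tokenized sentences can be sketched with the standard library alone. The regular expressions below are a deliberately naive assumption; real text usually calls for a proper sentence splitter and tokenizer.

```python
import re

raw_text = "The cat sat on the mat. The dog chased the cat!"

# Naive sentence split: break after ., ! or ? when followed by whitespace.
raw_sentences = re.split(r"(?<=[.!?])\s+", raw_text.strip())

# Lowercase each sentence and keep runs of letters as tokens.
sentences = [re.findall(r"[a-z]+", s.lower()) for s in raw_sentences]

print(sentences)
# [['the', 'cat', 'sat', 'on', 'the', 'mat'], ['the', 'dog', 'chased', 'the', 'cat']]
```

The resulting `sentences` list has the shape Word2Vec training expects: one inner list of word strings per sentence.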
Explain how to train a Word2Vec model using Gensim starting from raw text data.
Think about the steps from raw text to a trained model.
Describe the difference between CBOW and Skip-gram architectures in Word2Vec.
Focus on what each architecture tries to predict.