GloVe embeddings are used to represent words as vectors. What key information do these vectors capture?
Think about how GloVe uses counts of word pairs appearing together across the whole corpus.
GloVe embeddings are trained by analyzing how often words appear together across the entire text corpus, capturing semantic relationships.
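The co-occurrence idea can be sketched with a toy counting loop. This is only an illustration of the statistics GloVe starts from, not the full training objective; the corpus and window size are made up for the example.

```python
from collections import defaultdict

# Toy illustration: count how often word pairs co-occur within a small
# context window across a corpus. GloVe builds such a co-occurrence
# matrix, then fits vectors whose dot products predict the log counts.
corpus = ["the king rules the kingdom", "the queen rules the kingdom"]
window = 2
cooccur = defaultdict(int)

for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if i != j:
                cooccur[(w, words[j])] += 1

print(cooccur[("king", "rules")])  # prints 1
```

Because "king" and "queen" share contexts like "rules the kingdom", their co-occurrence rows look similar, which is what pushes their trained vectors close together.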
Given a loaded GloVe embedding dictionary glove_vectors where keys are words and values are vectors, what is the output of the code below?
word = 'king'
vector = glove_vectors.get(word)
print(len(vector))
Common GloVe embeddings come in sizes like 50, 100, 200, or 300 dimensions.
The output is the embedding dimension of the loaded file. Assuming the commonly used 300-dimensional GloVe embeddings, the vector length for 'king' is 300.
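A minimal runnable version of the snippet, using a hypothetical stand-in dictionary (real vectors would be loaded from a file such as glove.6B.300d.txt, and the dimension of 300 is an assumption):

```python
import numpy as np

# Hypothetical stand-in for a loaded GloVe dictionary; in practice the
# values come from a pre-trained file, here a random 300-dim vector.
glove_vectors = {"king": np.random.rand(300)}

word = 'king'
vector = glove_vectors.get(word)
print(len(vector))  # 300 for 300-dimensional embeddings
```

Note that .get() returns None for out-of-vocabulary words, so real code should check for a missing key before calling len().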
You want to build a sentiment analysis model on movie reviews. Which reason best justifies choosing pre-trained GloVe embeddings over training embeddings from scratch?
Think about the benefits of using embeddings trained on large text collections.
Pre-trained GloVe embeddings have learned word meanings from huge datasets, so they help models understand words better, especially when your own data is limited.
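One common way to reuse that learned knowledge is to initialize a model's embedding matrix from GloVe. A minimal sketch, assuming a hypothetical glove_vectors dict of 100-dimensional arrays and a made-up review vocabulary:

```python
import numpy as np

# Hypothetical pre-trained vectors (normally loaded from a GloVe file).
dim = 100
glove_vectors = {"good": np.ones(dim), "bad": -np.ones(dim)}
vocab = ["good", "bad", "rewatchable"]  # "rewatchable" is missing from GloVe

embedding_matrix = np.zeros((len(vocab), dim))
for i, word in enumerate(vocab):
    if word in glove_vectors:
        # Reuse the meaning learned from the large corpus.
        embedding_matrix[i] = glove_vectors[word]
    else:
        # Out-of-vocabulary words get small random vectors to be fine-tuned.
        embedding_matrix[i] = np.random.normal(scale=0.1, size=dim)

print(embedding_matrix.shape)  # (3, 100)
```

The matrix can then seed an embedding layer, so the sentiment model starts with sensible word representations even when the review dataset is small.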
When training GloVe embeddings, what is the effect of increasing the embedding dimension size from 50 to 300?
Think about trade-offs between vector size and information captured.
Larger embedding dimensions allow capturing more detailed word relationships but increase memory use and computation time.
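The memory side of that trade-off is easy to estimate. Assuming a 400,000-word vocabulary (roughly the size of the common GloVe releases) stored as 32-bit floats:

```python
# Rough memory estimate for an embedding table: vocab_size x dim floats,
# 4 bytes each, comparing the 50- and 300-dimensional choices.
vocab_size = 400_000
bytes_per_float = 4

for dim in (50, 300):
    mb = vocab_size * dim * bytes_per_float / 1e6
    print(f"{dim} dims: {mb:.0f} MB")  # 50 dims: 80 MB, 300 dims: 480 MB
```

A 6x jump in dimension means a 6x jump in memory, and every dot product against the table costs proportionally more compute as well.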
You trained GloVe embeddings on a custom corpus. Which metric best helps evaluate if the embeddings capture meaningful semantic relationships?
Think about how to measure if similar words have similar vectors.
Cosine similarity measures how close vectors are in direction, so high similarity for related words shows embeddings capture meaning well.
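Cosine similarity can be computed directly with NumPy. The three-dimensional vectors below are made-up stand-ins for trained embeddings, chosen so related words point in similar directions:

```python
import numpy as np

def cosine_similarity(u, v):
    # Cosine of the angle between two vectors: 1.0 means same direction,
    # 0.0 means orthogonal (unrelated directions).
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy vectors standing in for trained embeddings.
cat = np.array([1.0, 0.9, 0.1])
kitten = np.array([0.9, 1.0, 0.2])
car = np.array([-0.8, 0.1, 1.0])

print(cosine_similarity(cat, kitten))  # close to 1
print(cosine_similarity(cat, car))     # much lower
```

Evaluations often run this check over many known word pairs (e.g. synonym lists or analogy sets) and report whether related pairs score consistently higher than unrelated ones.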