Practice - 5 Tasks
Answer the questions below
Task 1: Fill in the blank (easy)
Topic: NLP
Complete the code to load the Sentence-BERT model for embeddings.

from sentence_transformers import SentenceTransformer
model = SentenceTransformer('[1]')
Hint: common mistakes to avoid:
Using a standard BERT model name instead of a Sentence-BERT model.
Using a language model like GPT-2 which is not for sentence embeddings.
Answer: 'all-MiniLM-L6-v2'. It is a popular Sentence-BERT model for generating sentence embeddings.
Task 2: Fill in the blank (medium)
Topic: NLP
Complete the code to generate embeddings for a list of sentences.

sentences = ['I love machine learning.', 'Sentence embeddings are useful.']
embeddings = model.[1](sentences)
Hint: common mistakes to avoid:
Using 'predict' which is for classification models.
Using 'fit' which is for training models.
Answer: 'encode'. The encode method generates embeddings from sentences using the Sentence-BERT model.
Task 3: Fill in the blank (hard)
Topic: NLP
Fix the error in the code to correctly compute cosine similarity between two embeddings.

from sklearn.metrics.pairwise import [1]
similarity = cosine_similarity([embeddings[0]], [embeddings[1]])[0][0]
Hint: common mistakes to avoid:
Using distance functions instead of similarity functions.
Using functions that return distances rather than similarity scores.
Answer: 'cosine_similarity'. This function computes the cosine similarity between vectors, which is commonly used for comparing embeddings.
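To see what cosine_similarity actually computes, here is a minimal pure-Python sketch of the same formula (dot product divided by the product of the vector norms). It uses small toy vectors rather than real model embeddings so it runs without sklearn:

```python
import math

def cosine_sim(a, b):
    # Cosine similarity: dot(a, b) / (||a|| * ||b||)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (real Sentence-BERT vectors have 384 dims)
v1 = [1.0, 2.0, 3.0]
v2 = [2.0, 4.0, 6.0]   # same direction as v1
v3 = [-1.0, 0.0, 0.0]

print(cosine_sim(v1, v2))  # parallel vectors give 1.0
print(cosine_sim(v1, v3))  # opposing vectors give a negative score
```

Vectors pointing the same way score close to 1.0 regardless of their magnitude, which is why cosine similarity (not a distance function) is the usual choice for comparing embeddings.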
Task 4: Fill in the blank (hard)
Topic: NLP
Fill both blanks to create a dictionary comprehension mapping sentences to their embedding lengths.

lengths = {sentence: len(embedding) for sentence, embedding in zip(sentences, [1])}
filtered = {k: v for k, v in lengths.items() if v [2] 384}
Hint: common mistakes to avoid:
Using 'sentences' instead of 'embeddings' in the zip.
Using '<' instead of '>' for filtering.
Answer: 'embeddings' for blank [1] and '>' for blank [2]. We zip over 'embeddings' to pair each sentence with its vector, then keep entries whose length is greater than 384.
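The completed pattern can be run end to end with toy lists standing in for real embedding vectors (real Sentence-BERT output would be 384-dimensional; the threshold of 3 below is adjusted to match the toy data):

```python
sentences = ['I love machine learning.', 'Sentence embeddings are useful.']
# Hypothetical toy embeddings in place of model.encode(sentences)
embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6, 0.7]]

# zip pairs each sentence with its vector; len gives the dimensionality
lengths = {sentence: len(embedding)
           for sentence, embedding in zip(sentences, embeddings)}

# Keep only entries whose vectors exceed the threshold
filtered = {k: v for k, v in lengths.items() if v > 3}

print(lengths)   # {'I love machine learning.': 3, 'Sentence embeddings are useful.': 4}
print(filtered)  # {'Sentence embeddings are useful.': 4}
```

Zipping 'sentences' with 'sentences' (the first common mistake) would map each sentence to its character count instead of its embedding dimensionality.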
Task 5: Fill in the blank (hard)
Topic: NLP
Fill all three blanks to create a dictionary of sentences and their cosine similarity scores above 0.7 with the first sentence.

from sklearn.metrics.pairwise import [1]
scores = {sentence: [2]([embeddings[0]], [embedding])[0][0]
          for sentence, embedding in zip(sentences, embeddings)}
filtered_scores = {k: v for k, v in scores.items() if v [3] 0.7}
Hint: common mistakes to avoid:
Using 'cosine_distances' which returns distances, not similarity.
Using '<' instead of '>' for filtering.
Answer: 'cosine_similarity' for blanks [1] and [2], and '>' for blank [3]. We import and apply cosine_similarity to score each sentence against the first, then keep scores greater than 0.7.
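Putting the completed pattern together, here is a self-contained sketch that scores every sentence against the first one and keeps only close matches. It uses a hand-rolled cos_sim helper (a hypothetical stand-in, not the sklearn import from the task) and toy 2-D vectors so it runs with no dependencies; with sklearn installed, cosine_similarity would take the helper's place:

```python
import math

def cos_sim(a, b):
    # Stand-in for sklearn's cosine_similarity applied to two 1-D vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

sentences = ['I love machine learning.', 'Sentence embeddings are useful.']
# Toy unit vectors in place of model.encode(sentences)
embeddings = [[1.0, 0.0], [0.8, 0.6]]

# Score each sentence against the first embedding
scores = {sentence: cos_sim(embeddings[0], embedding)
          for sentence, embedding in zip(sentences, embeddings)}

# Keep only scores strictly above 0.7
filtered_scores = {k: v for k, v in scores.items() if v > 0.7}

print(scores)           # first sentence scores exactly 1.0 against itself
print(filtered_scores)  # both entries survive: 1.0 and roughly 0.8 exceed 0.7
```

Using cosine_distances here (the first common mistake) would invert the logic: near-identical sentences would get scores near 0 and be filtered out by the '> 0.7' check.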