Challenge - 5 Problems
Sentence-BERT Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
❓ Predict Output
intermediate · 1:30 remaining
What is the shape of the embeddings produced by Sentence-BERT?
Given the following code snippet that uses Sentence-BERT to encode a list of sentences, what is the shape of the resulting embeddings array?
from sentence_transformers import SentenceTransformer

sentences = ['I love machine learning.', 'Sentence-BERT creates embeddings.']
model = SentenceTransformer('all-MiniLM-L6-v2')
embeddings = model.encode(sentences)
print(embeddings.shape)
💡 Hint
The 'all-MiniLM-L6-v2' model produces 384-dimensional embeddings for each sentence.
✗ Incorrect
Sentence-BERT models like 'all-MiniLM-L6-v2' output one 384-dimensional embedding per input sentence. With 2 sentences, the shape is (2, 384).
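As a quick sanity check of the expected shape without downloading the model, a stand-in NumPy array can play the role of the encoder's output (the real `model.encode` call returns an array of the same shape for this model):

```python
import numpy as np

# Stand-in for model.encode(sentences): 'all-MiniLM-L6-v2' returns one
# 384-dimensional vector per input sentence.
sentences = ['I love machine learning.', 'Sentence-BERT creates embeddings.']
embeddings = np.zeros((len(sentences), 384))  # same shape encode() would return

print(embeddings.shape)  # (2, 384)
```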
❓ Model Choice
intermediate · 1:30 remaining
Which Sentence-BERT model is best for fast embedding generation on CPU?
You want to generate sentence embeddings quickly on a CPU with limited memory. Which Sentence-BERT model should you choose?
💡 Hint
Smaller models with fewer layers run faster on CPU.
✗ Incorrect
'all-MiniLM-L6-v2' has only 6 transformer layers, making it lightweight and fast and well suited to CPU inference with limited memory.
❓ Hyperparameter
advanced · 1:30 remaining
Which parameter affects the batch size during Sentence-BERT encoding?
When calling the encode() method of a Sentence-BERT model, which parameter controls how many sentences are processed at once?
💡 Hint
This parameter helps balance speed and memory usage.
✗ Incorrect
The 'batch_size' parameter sets how many sentences are encoded in one go, affecting speed and memory.
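A minimal sketch of what batching means here, using a hypothetical helper rather than the library's actual internals: the input list is split into chunks of at most `batch_size` sentences, and only one chunk is processed per forward pass.

```python
def chunk_sentences(sentences, batch_size=32):
    """Split the input into batches of at most batch_size sentences,
    mirroring how an encoder groups inputs before each forward pass.
    (Illustrative helper, not sentence-transformers internals.)"""
    return [sentences[i:i + batch_size]
            for i in range(0, len(sentences), batch_size)]

batches = chunk_sentences(['a sentence'] * 100, batch_size=32)
print([len(b) for b in batches])  # [32, 32, 32, 4]
```

A larger `batch_size` means fewer forward passes (faster) but more sentences resident in memory at once.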
❓ Metrics
advanced · 1:30 remaining
What metric is commonly used to evaluate Sentence-BERT embeddings on semantic textual similarity tasks?
Which metric best measures how well Sentence-BERT embeddings capture sentence similarity on datasets like STS Benchmark?
💡 Hint
This metric measures rank correlation between predicted and true similarity scores.
✗ Incorrect
Spearman's rank correlation (rho), computed between cosine similarity scores and gold human judgments, is the standard metric for semantic textual similarity benchmarks such as STS.
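A small illustration of the evaluation using NumPy only: Spearman's rho is the Pearson correlation of rank-transformed scores. The similarity values below are made-up example numbers, not STS Benchmark data (and the tie handling a real evaluator would need is omitted).

```python
import numpy as np

def spearman_rho(a, b):
    # Spearman's rho = Pearson correlation of the ranks of the scores
    # (simplified: no tie handling, fine for these distinct example values).
    ranks = lambda x: np.argsort(np.argsort(x)).astype(float)
    ra, rb = ranks(np.asarray(a)), ranks(np.asarray(b))
    return np.corrcoef(ra, rb)[0, 1]

# Hypothetical cosine similarities vs. gold human similarity scores
predicted_cosine = [0.9, 0.1, 0.5, 0.7]
gold_scores      = [4.8, 0.5, 2.0, 4.0]
print(spearman_rho(predicted_cosine, gold_scores))  # 1.0: identical rankings
```

Because rho depends only on rankings, it rewards embeddings that order sentence pairs correctly even if the raw cosine values are on a different scale than the human scores.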
🔧 Debug
expert · 2:00 remaining
Why does this Sentence-BERT encoding code raise a RuntimeError?
Consider this code snippet:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')
sentences = ['Hello world'] * 10000
embeddings = model.encode(sentences, device='cuda')
It raises a RuntimeError: CUDA out of memory. What is the best way to fix this?
💡 Hint
Large batches can exceed GPU memory limits.
✗ Incorrect
Reducing the batch_size argument to encode() lowers the memory needed per forward pass, so encoding fits within available GPU memory and the CUDA out-of-memory error is avoided.
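In practice the fix is to pass a smaller `batch_size`, e.g. `model.encode(sentences, device='cuda', batch_size=8)`. A back-of-the-envelope sketch of why this works (the per-sentence memory figure is a made-up assumption for illustration):

```python
def peak_batch_memory_mb(n_sentences, batch_size, mb_per_sentence):
    """Rough upper bound on activation memory: only one batch is resident
    on the GPU at a time, so peak memory scales with batch_size, not with
    the total number of sentences. (Toy model, not a real profiler.)"""
    batch = min(batch_size, n_sentences)
    return batch * mb_per_sentence

# 10,000 sentences, assuming a hypothetical 4 MB of activations each
print(peak_batch_memory_mb(10_000, 10_000, 4))  # 40000 MB: one giant batch
print(peak_batch_memory_mb(10_000, 8, 4))       # 32 MB: batch_size=8
```

Total runtime grows with the number of batches, but peak GPU memory stays bounded by the batch size, which is exactly the trade-off the `batch_size` parameter exposes.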