Challenge - 5 Problems


❓ Predict Output

intermediate

What is the shape of the embeddings produced by Sentence-BERT?

Given the following code snippet that uses Sentence-BERT to encode a list of sentences, what is the shape of the resulting embeddings array?

NLP

from sentence_transformers import SentenceTransformer
sentences = ['I love machine learning.', 'Sentence-BERT creates embeddings.']
model = SentenceTransformer('all-MiniLM-L6-v2')
embeddings = model.encode(sentences)
print(embeddings.shape)

A. (2, 384)

B. (384, 2)

C. (2, 768)

D. (768, 2)

❓ Model Choice

intermediate

Which Sentence-BERT model is best for fast embedding generation on CPU?

You want to generate sentence embeddings quickly on a CPU with limited memory. Which Sentence-BERT model should you choose?

A. 'bert-large-nli-stsb-mean-tokens' - large and accurate model

B. 'distilbert-base-nli-stsb-mean-tokens' - medium speed model

C. 'all-MiniLM-L6-v2' - small and fast model

D. 'roberta-base-nli-stsb-mean-tokens' - medium size model

❓ Hyperparameter

advanced

Which parameter affects the batch size during Sentence-BERT encoding?

When calling the encode() method of a Sentence-BERT model, which parameter controls how many sentences are processed at once?

A. num_workers

B. max_length

C. device

D. batch_size

❓ Metrics

advanced

What metric is commonly used to evaluate Sentence-BERT embeddings on semantic textual similarity tasks?

Which metric best measures how well Sentence-BERT embeddings capture sentence similarity on datasets like STS Benchmark?

A. F1 score

B. Spearman's rank correlation (rho) between cosine similarities and human similarity scores

C. Accuracy

D. Mean squared error
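
To make the rank-correlation idea concrete, here is a minimal sketch in plain Python. It assumes you already have one predicted similarity per sentence pair (for example, a cosine similarity between two embeddings) and one human score per pair. The numbers below are made up for illustration and do not come from STS Benchmark; the helper names `ranks`, `pearson`, and `spearman` are hypothetical, not library functions.

```python
import math

def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up example: human scores (0-5 scale) and predicted cosine similarities.
gold = [5.0, 4.2, 3.1, 1.5, 0.2]
predicted = [0.92, 0.81, 0.85, 0.33, 0.05]

# Only the second and third pairs are ranked in the wrong order,
# so rho is 0.9 (up to floating-point rounding).
rho = spearman(predicted, gold)
print(round(rho, 3))
```

Because only the ordering matters, the predicted similarities do not need to be on the same scale as the human scores.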

🔧 Debug

expert

Why does this Sentence-BERT encoding code raise a RuntimeError?

Consider this code snippet:

from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')
sentences = ['Hello world'] * 10000
embeddings = model.encode(sentences, device='cuda')

It raises a RuntimeError: CUDA out of memory. What is the best way to fix this?

A. Reduce batch_size in encode() to a smaller number, such as 8 (the default is 32)

B. Remove device='cuda' to run on CPU instead

C. Increase the number of sentences processed at once

D. Use a larger GPU with more memory
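
As a rough illustration of why batch size bounds peak memory, the sketch below processes the inputs chunk by chunk. The encode() method of sentence-transformers does its own batching through the batch_size argument; the helper `encode_in_batches` and the stand-in `fake_encode_batch` are hypothetical and exist only so the example runs without a model or a GPU.

```python
import math

def encode_in_batches(sentences, encode_batch, batch_size=32):
    """Encode chunk by chunk so at most `batch_size` inputs are in flight at once."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    embeddings = []
    for start in range(0, len(sentences), batch_size):
        chunk = sentences[start:start + batch_size]
        embeddings.extend(encode_batch(chunk))
    return embeddings

# Stand-in encoder (hypothetical): records each chunk size it receives
# and returns a dummy 3-dimensional vector per sentence.
seen_chunk_sizes = []

def fake_encode_batch(chunk):
    seen_chunk_sizes.append(len(chunk))
    return [[float(len(s)), 0.0, 0.0] for s in chunk]

sentences = ['Hello world'] * 10000
vectors = encode_in_batches(sentences, fake_encode_batch, batch_size=32)

# Number of chunks needed: ceil(10000 / 32).
num_batches = math.ceil(len(sentences) / 32)
print(len(vectors), num_batches, max(seen_chunk_sizes))
```

A smaller batch size means more passes over the model but a smaller working set per pass, which is the trade-off at issue when memory runs out.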

Practice

(1/5)

1. What is the main purpose of Sentence-BERT embeddings in NLP?

easy

A. To count the number of words in a sentence

B. To translate sentences into different languages

C. To generate random sentences for data augmentation

D. To convert sentences into numbers that capture their meaning
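
A small sketch of what "numbers that capture meaning" allows in practice: once sentences are vectors, closeness in meaning can be scored with cosine similarity. The 3-dimensional vectors below are invented for illustration; real embeddings would come from model.encode() and have many more dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Made-up toy embeddings (assumption: not produced by any real model).
toy_embeddings = {
    'A cat sits on the mat.': [0.9, 0.1, 0.0],
    'A kitten rests on the rug.': [0.8, 0.2, 0.1],
    'Stock prices fell sharply.': [0.0, 0.1, 0.9],
}

query = 'A cat sits on the mat.'
scores = {
    sentence: cosine_similarity(toy_embeddings[query], vector)
    for sentence, vector in toy_embeddings.items()
    if sentence != query
}

# The sentence whose vector points in the most similar direction to the query.
closest = max(scores, key=scores.get)
print(closest)
```

With these toy vectors the two pet sentences score far higher against each other than either does against the finance sentence, which is the behaviour a good embedding model is expected to show.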

Sentence-BERT for embeddings in NLP - Practice Problems & Coding Challenges

Solution

Step 1: Understand Sentence-BERT's role

Step 2: Compare options with Sentence-BERT's function

Final Answer:

Quick Check:

Solution

Step 1: Recall correct import and model loading syntax

Step 2: Check each option for correctness

Final Answer:

Quick Check:

Solution

Step 1: Understand input and output of model.encode()

Step 2: Know embedding dimension of 'all-MiniLM-L6-v2'

Final Answer:

Quick Check:

Solution

Step 1: Analyze the error message

Step 2: Understand correct usage

Final Answer:

Quick Check:

Solution

Step 1: Understand how Sentence-BERT embeddings are used for similarity

Step 2: Evaluate options for similarity search

Final Answer:

Quick Check: