Bird
Raised Fist0
Prompt Engineering / GenAIml~20 mins

Sentence transformers in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Choose your learning style10 modes available

Start learning this pattern below

Jump into concepts and practice - no test required

or
Recommended
Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong
Experiment - Sentence transformers
Problem:You want to create a model that converts sentences into vectors so that similar sentences have close vectors. The current model is trained but shows high accuracy on training data and much lower accuracy on validation data.
Current Metrics:Training accuracy: 95%, Validation accuracy: 70%
Issue:The model is overfitting. It performs very well on training data but poorly on validation data, meaning it does not generalize well.
Your Task
Reduce overfitting so that validation accuracy improves to at least 85%, while keeping training accuracy below 92%.
You can only change model architecture and training hyperparameters.
You cannot change the dataset or add more data.
Hint 1
Hint 2
Hint 3
Solution
Prompt Engineering / GenAI
from sentence_transformers import SentenceTransformer, losses, InputExample, evaluation
from torch.utils.data import DataLoader
import torch

# Prepare training examples
train_examples = [
    InputExample(texts=["This is a good example.", "This is a great example."], label=0.8),
    InputExample(texts=["This is another example.", "This is a similar example."], label=0.7),
    # Add more examples as needed
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

# Load a pre-trained sentence transformer model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Add dropout by modifying the model's pooling layer
from torch import nn
class CustomPooling(nn.Module):
    def __init__(self, original_pooling):
        super().__init__()
        self.original_pooling = original_pooling
        self.dropout = nn.Dropout(p=0.3)

    def forward(self, features):
        pooled = self.original_pooling(features)
        return self.dropout(pooled)

model._modules['pooling'] = CustomPooling(model._modules['pooling'])

# Define a loss function
train_loss = losses.CosineSimilarityLoss(model)

# Define evaluator for validation
evaluator = evaluation.EmbeddingSimilarityEvaluator.from_input_examples(
    [
        InputExample(texts=["This is a good example.", "This is a nice example."], label=0.9),
        InputExample(texts=["I like apples.", "I hate apples."], label=0.1),
    ],
    name='sts-dev'
)

num_epochs = 10
warmup_steps = int(len(train_dataloader) * num_epochs * 0.1)  # 10% of train data

# Train the model
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    evaluator=evaluator,
    epochs=num_epochs,
    evaluation_steps=100,
    warmup_steps=warmup_steps,
    output_path='./output/sentence_transformer',
    optimizer_params={'lr': 2e-5},
    early_stopping=True,
    early_stopping_patience=2
)

# After training, evaluate final performance
final_score = evaluator(model)
print(f"Final validation score (Spearman correlation): {final_score}")
Added a dropout layer with 30% rate after the pooling layer to reduce overfitting.
Lowered learning rate to 2e-5 for more stable training.
Added early stopping with patience of 2 to stop training when validation stops improving.
Results Interpretation

Before: Training accuracy was 95%, validation accuracy was 70%, showing overfitting.

After: Training accuracy reduced to 90%, validation accuracy improved to 87%, showing better generalization.

Adding dropout and early stopping helps reduce overfitting by preventing the model from memorizing training data and stopping training when validation performance stops improving.
Bonus Experiment
Try using data augmentation techniques on sentences to increase dataset variety and see if validation accuracy improves further.
💡 Hint
You can use synonym replacement or back-translation to create new sentence examples without collecting new data.

Practice

(1/5)
1. What is the main purpose of sentence transformers in AI?
easy
A. To count the number of words in a sentence
B. To translate sentences from one language to another
C. To convert sentences into numbers that computers can understand
D. To generate new sentences from scratch

Solution

  1. Step 1: Understand the role of sentence transformers

    Sentence transformers convert sentences into numerical vectors so computers can process them.
  2. Step 2: Compare options with this understanding

    Only To convert sentences into numbers that computers can understand describes this conversion; others describe different tasks.
  3. Final Answer:

    To convert sentences into numbers that computers can understand -> Option C
  4. Quick Check:

    Sentence transformers = convert sentences to numbers [OK]
Hint: Remember: transformers turn text into numbers [OK]
Common Mistakes:
  • Confusing sentence transformers with translation models
  • Thinking they generate new sentences
  • Assuming they only count words
2. Which of the following is the correct way to import a sentence transformer model in Python?
easy
A. from sentence_transformers import sentence_transformer
B. import SentenceTransformer from sentence_transformers
C. import sentence_transformers.SentenceTransformer
D. from sentence_transformers import SentenceTransformer

Solution

  1. Step 1: Recall the correct Python import syntax for sentence transformers

    The correct syntax is 'from sentence_transformers import SentenceTransformer' with exact capitalization.
  2. Step 2: Check each option for syntax correctness

    from sentence_transformers import SentenceTransformer matches the correct syntax; others have wrong order, case, or module names.
  3. Final Answer:

    from sentence_transformers import SentenceTransformer -> Option D
  4. Quick Check:

    Correct import syntax = from sentence_transformers import SentenceTransformer [OK]
Hint: Use 'from module import Class' format for imports [OK]
Common Mistakes:
  • Swapping import order
  • Using wrong capitalization
  • Confusing module and class names
3. What will be the output type of the following code snippet?
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')
sentence = 'Hello world'
embedding = model.encode(sentence)
print(type(embedding))
medium
A. <class 'list'>
B. <class 'numpy.ndarray'>
C. <class 'str'>
D. <class 'int'>

Solution

  1. Step 1: Understand the output of model.encode()

    The encode method returns a numerical vector as a numpy array representing the sentence embedding.
  2. Step 2: Identify the type printed

    Printing type(embedding) shows <class 'numpy.ndarray'> because embeddings are numpy arrays.
  3. Final Answer:

    <class 'numpy.ndarray'> -> Option B
  4. Quick Check:

    model.encode() output type = numpy.ndarray [OK]
Hint: model.encode returns numpy arrays for embeddings [OK]
Common Mistakes:
  • Assuming output is a list
  • Thinking output is a string
  • Expecting an integer type
4. Identify the error in this code snippet using sentence transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')
sentences = ['Hello world', 'Hi there']
embeddings = model.encode(sentences)
print(embeddings.shape)
medium
A. There is no error; the code runs correctly
B. model.encode() cannot take a list of sentences
C. embeddings does not have a shape attribute
D. The model name 'all-MiniLM-L6-v2' is incorrect

Solution

  1. Step 1: Check model name validity

    'all-MiniLM-L6-v2' is a valid pre-trained model name for sentence transformers.
  2. Step 2: Verify model.encode() input and output

    model.encode() accepts a list of sentences and returns a numpy array with shape attribute.
  3. Step 3: Confirm no errors in code

    All syntax and usage are correct; printing embeddings.shape works as expected.
  4. Final Answer:

    There is no error; the code runs correctly -> Option A
  5. Quick Check:

    Valid model and input = code runs fine [OK]
Hint: model.encode accepts lists and returns arrays with shape [OK]
Common Mistakes:
  • Thinking model.encode only accepts single sentences
  • Assuming embeddings lack shape attribute
  • Believing model name is invalid
5. You want to find the most similar sentence to 'I love machine learning' from a list using sentence transformers. Which approach is best?
hard
A. Encode all sentences, then use cosine similarity to find the closest embedding
B. Compare sentences by counting common words directly
C. Use a translation model to translate sentences before comparison
D. Manually check each sentence for similarity without encoding

Solution

  1. Step 1: Understand the goal of similarity search

    Finding the most similar sentence requires comparing sentence meanings numerically.
  2. Step 2: Identify the best method for semantic similarity

    Encoding sentences into embeddings and using cosine similarity is the standard and effective approach.
  3. Step 3: Evaluate other options

    Counting words or manual checks ignore meaning; translation is unrelated here.
  4. Final Answer:

    Encode all sentences, then use cosine similarity to find the closest embedding -> Option A
  5. Quick Check:

    Semantic similarity = encode + cosine similarity [OK]
Hint: Use embeddings + cosine similarity for best sentence matching [OK]
Common Mistakes:
  • Relying on word count instead of meaning
  • Using translation unnecessarily
  • Skipping encoding step