NLPml~20 mins

Few-shot learning with prompts in NLP - ML Experiment: Train & Evaluate

Choose your learning style9 modes available

Learn Why Deep Model Try Challenge Experiment Recall Metrics

Experiment - Few-shot learning with prompts

Problem:We want to teach a language model to classify movie reviews as positive or negative using very few examples (few-shot learning) by providing prompts. The current model is trained with 2 examples per class but shows low accuracy on new reviews.

Current Metrics:Training accuracy: 95%, Validation accuracy: 60%

Issue:The model overfits the few examples and does not generalize well to new data, resulting in low validation accuracy.

Your Task

Improve validation accuracy to at least 75% while keeping training accuracy below 90% to reduce overfitting.

You can only change the prompt design and the number of examples in the prompt (max 5 per class).

You cannot retrain the underlying language model weights.

You must use few-shot prompting techniques.

Hint 1

Hint 2

Hint 3

Solution

NLP

from transformers import pipeline

# Load zero-shot classification pipeline with a pretrained model
classifier = pipeline('zero-shot-classification', model='facebook/bart-large-mnli')

# Define candidate labels
candidate_labels = ['positive', 'negative']

# Create a prompt with 5 examples per class and clear instructions
prompt = '''Classify the sentiment of the following movie reviews as positive or negative.

Example 1: "I loved this movie, it was fantastic!" -> positive
Example 2: "The film was boring and too long." -> negative
Example 3: "An amazing story and great acting." -> positive
Example 4: "I did not enjoy the movie, it was dull." -> negative
Example 5: "A wonderful experience, highly recommend." -> positive
Example 6: "Terrible plot and bad dialogue." -> negative
Example 7: "Brilliant direction and captivating scenes." -> positive
Example 8: "Waste of time, very disappointing." -> negative
Example 9: "Heartwarming and beautifully made." -> positive
Example 10: "Poorly executed and uninteresting." -> negative

Review: "The movie had stunning visuals but the story was weak."
Sentiment: '''

# Use the classifier to predict the sentiment of the new review
result = classifier(prompt, candidate_labels)

print(f"Predicted sentiment: {result['labels'][0]} with score {result['scores'][0]:.2f}")

Increased the number of examples in the prompt from 2 to 10 (5 per class) to provide more context.

Added clear instructions at the start of the prompt to guide the model.

Used consistent formatting for examples to help the model recognize the pattern.

Results Interpretation

Before: Training accuracy: 95%, Validation accuracy: 60%

After: Training accuracy: 88%, Validation accuracy: 78%

Adding more diverse and clearly formatted examples in the prompt helps the model generalize better, reducing overfitting and improving validation accuracy in few-shot learning.

Bonus Experiment

Try adding an explicit instruction asking the model to explain its reasoning before giving the sentiment label.

💡 Hint

Include a line like 'Explain your reasoning step-by-step, then give the sentiment.' in the prompt and observe if it improves accuracy.