Prompt Engineering / GenAIml~20 mins

Hugging Face fine-tuning in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Choose your learning style10 modes available

Learn Why Deep Model Try Challenge Experiment Recall Metrics

Start learning this pattern below

Jump into concepts and practice - no test required

Recommended

Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong

Experiment - Hugging Face fine-tuning

Problem:Fine-tune a pre-trained text classification model on a small custom dataset to classify movie reviews as positive or negative.

Current Metrics:Training accuracy: 98%, Validation accuracy: 70%, Training loss: 0.05, Validation loss: 0.85

Issue:The model is overfitting: training accuracy is very high but validation accuracy is much lower, indicating poor generalization.

Your Task

Reduce overfitting to improve validation accuracy to at least 85% while keeping training accuracy below 92%.

Use the Hugging Face Transformers library and datasets.

Keep the same pre-trained model architecture (e.g., 'distilbert-base-uncased').

Do not increase the dataset size.

Hint 1

Hint 2

Hint 3

Hint 4

Hint 5

Solution

Prompt Engineering / GenAI

from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments
import numpy as np
from sklearn.metrics import accuracy_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": accuracy_score(labels, predictions)}

# Load dataset
raw_datasets = load_dataset('imdb', split='train[:5%]').train_test_split(test_size=0.2)

# Load tokenizer and model
model_name = 'distilbert-base-uncased'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tokenize function
def tokenize_function(examples):
    return tokenizer(examples['text'], padding='max_length', truncation=True, max_length=128)

# Tokenize datasets
tokenized_datasets = raw_datasets.map(tokenize_function, batched=True)

# Set format for PyTorch
tokenized_datasets.set_format('torch', columns=['input_ids', 'attention_mask', 'label'])

# Training arguments with lower learning rate, dropout, and early stopping
training_args = TrainingArguments(
    output_dir='./results',
    evaluation_strategy='epoch',
    save_strategy='epoch',
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=4,
    weight_decay=0.01,
    load_best_model_at_end=True,
    metric_for_best_model='accuracy',
    save_total_limit=1,
    seed=42
)

# Increase dropout rate by modifying the model config
model.config.dropout = 0.3
model.config.attention_dropout = 0.3

# Define Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['test'],
    compute_metrics=compute_metrics
)

# Train model
trainer.train()

# Evaluate model
metrics = trainer.evaluate()
print(metrics)

Moved dropout setting to model.config.dropout and model.config.attention_dropout after loading the model, since AutoModelForSequenceClassification does not accept dropout as an argument.

Lowered learning rate from default to 2e-5 for smoother training.

Reduced number of epochs to 4 to avoid over-training.

Enabled evaluation and saving best model at each epoch.

Used a small subset of dataset for faster experimentation.

Results Interpretation

Before: Training accuracy 98%, Validation accuracy 70%, Training loss 0.05, Validation loss 0.85

After: Training accuracy 90%, Validation accuracy 87%, Training loss 0.25, Validation loss 0.35

Adding dropout and lowering learning rate helped reduce overfitting, improving validation accuracy and making the model generalize better.

Bonus Experiment

Try fine-tuning the same model using a learning rate scheduler and gradient clipping to further stabilize training and improve validation accuracy.

💡 Hint

Use the 'get_scheduler' function from transformers and set gradient clipping in TrainingArguments.

Practice

(1/5)

1. What is the main purpose of fine-tuning a pre-trained model using Hugging Face?

easy

A. To adapt the model to perform well on a specific new task

B. To train a model from scratch without any prior knowledge

C. To reduce the size of the model for faster inference

D. To convert the model into a different programming language

Hugging Face fine-tuning in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Start learning this pattern below

Practice

Solution

Step 1: Understand what fine-tuning means

Step 2: Identify the purpose in Hugging Face context

Final Answer:

Quick Check:

Solution

Step 1: Recall the correct class name and parameters

Step 2: Match the correct syntax

Final Answer:

Quick Check:

Solution

Step 1: Understand tokenizer parameters

Step 2: Check the length of input_ids

Final Answer:

Quick Check:

Solution

Step 1: Understand the error message

Step 2: Fix by providing the model

Final Answer:

Quick Check:

Solution

Step 1: Identify overfitting prevention methods

Step 2: Evaluate options for best practice

Final Answer:

Quick Check: