Pre-training and fine-tuning concept in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Experiment - Pre-training and fine-tuning concept
Problem: You have a language model pre-trained on a large text dataset. Now you want to adapt it to classify movie reviews as positive or negative.
Current Metrics: Training accuracy: 98%, Validation accuracy: 70%
Issue: The model overfits the training data and performs poorly on validation data.
Your Task
Reduce overfitting and improve validation accuracy to at least 85% while keeping training accuracy below 92%.
You can only adjust fine-tuning parameters and add regularization techniques.
You cannot change the pre-trained model architecture or dataset.
Solution
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

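# Assume pre_trained_model is an already-loaded Keras model whose output is a
# fixed-size feature vector per review, and that train_data, train_labels,
# val_data, and val_labels are already prepared.
# Freeze base model layers so only the new classification head is trained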
for layer in pre_trained_model.layers:
    layer.trainable = False

# Add classification head
x = pre_trained_model.output
x = Dropout(0.3)(x)
x = Dense(64, activation='relu')(x)
x = Dropout(0.3)(x)
outputs = Dense(1, activation='sigmoid')(x)

model = Model(inputs=pre_trained_model.input, outputs=outputs)

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss='binary_crossentropy',
              metrics=['accuracy'])

early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

history = model.fit(train_data, train_labels,
                    epochs=20,
                    batch_size=32,
                    validation_data=(val_data, val_labels),
                    callbacks=[early_stop])
Froze the pre-trained model layers to keep learned features stable.
Added dropout layers with 30% rate to reduce overfitting.
Lowered learning rate to 1e-5 for smoother fine-tuning.
Used early stopping to stop training when validation loss stops improving.
Results Interpretation

Before: Training accuracy: 98%, Validation accuracy: 70% (overfitting)

After: Training accuracy: 90%, Validation accuracy: 87% (better generalization)

Freezing the pre-trained layers, adding dropout, and lowering the learning rate together reduce overfitting and improve validation accuracy during fine-tuning.
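The before/after comparison can also be checked numerically. The sketch below is a minimal helper, assuming accuracies are given as fractions. The function names `generalization_gap` and `meets_targets` are illustrative and not part of any library; the default thresholds come from the task statement (validation at least 85%, training below 92%).

```python
def generalization_gap(train_acc, val_acc):
    """Return training accuracy minus validation accuracy."""
    return train_acc - val_acc


def meets_targets(train_acc, val_acc, min_val=0.85, max_train=0.92):
    """Check the task's goals: validation >= 85% and training < 92%."""
    return val_acc >= min_val and train_acc < max_train


# Accuracies reported in the experiment, as (training, validation).
before = (0.98, 0.70)
after = (0.90, 0.87)

for name, (train_acc, val_acc) in [("before", before), ("after", after)]:
    gap = generalization_gap(train_acc, val_acc)
    print(f"{name}: gap={gap:.2f}, meets targets={meets_targets(train_acc, val_acc)}")
# before: gap=0.28, meets targets=False
# after: gap=0.03, meets targets=True
```

A large gap (0.28 here) is the usual sign of overfitting; a small gap with high validation accuracy suggests the model generalizes better.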
Bonus Experiment
Try unfreezing some of the last layers of the pre-trained model and fine-tuning them with a very low learning rate to see if validation accuracy improves further.
💡 Hint
Unfreeze only the last 2-3 layers and keep the rest frozen. Use a learning rate around 1e-6.
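One way to sketch the hint is a small helper that flips the `trainable` flag on only the last few layers. The helper name `unfreeze_last_layers` and the `toy_model` stand-in are assumptions made for illustration; the helper relies only on a `layers` list whose items have `name` and `trainable` attributes, which Keras models also provide.

```python
from types import SimpleNamespace


def unfreeze_last_layers(base_model, n_last=3):
    """Mark only the last n_last layers of base_model as trainable."""
    layers = list(base_model.layers)
    cutoff = max(len(layers) - n_last, 0)
    for index, layer in enumerate(layers):
        # Layers before the cutoff stay frozen; the rest become trainable.
        layer.trainable = index >= cutoff
    return [layer.name for layer in layers if layer.trainable]


# Stand-in for a real pre-trained model so the sketch runs without TensorFlow.
toy_model = SimpleNamespace(
    layers=[SimpleNamespace(name=f"block_{i}", trainable=False) for i in range(6)]
)

print(unfreeze_last_layers(toy_model, n_last=2))  # ['block_4', 'block_5']
```

With a real Keras model, the model would need to be compiled again after changing `trainable` for the change to take effect, for example with `Adam(learning_rate=1e-6)` as suggested in the hint.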

Practice

1. What is the main purpose of pre-training in machine learning models?
easy
A. To delete unnecessary data from the model
B. To adjust the model for a specific task
C. To evaluate the model's performance
D. To teach the model general knowledge from large data

Solution

  1. Step 1: Understand pre-training role

    Pre-training is done on large datasets to help the model learn general patterns and knowledge.
  2. Step 2: Differentiate from fine-tuning

    Fine-tuning is the step where the model is adapted to a specific task, not the initial general learning.
  3. Final Answer:

    To teach the model general knowledge from large data -> Option D
  4. Quick Check:

    Pre-training = General knowledge learning [OK]
Hint: Pre-training = general learning, fine-tuning = task-specific [OK]
Common Mistakes:
  • Confusing pre-training with fine-tuning
  • Thinking pre-training is for evaluation
  • Assuming pre-training deletes data
2. Which of the following is the correct way to describe fine-tuning?
easy
A. Adjusting a pre-trained model to perform a specific task
B. Removing layers from a neural network
C. Training a model from scratch on a small dataset
D. Collecting data for training

Solution

  1. Step 1: Define fine-tuning

    Fine-tuning means taking a model already trained on general data and adjusting it for a specific task.
  2. Step 2: Eliminate incorrect options

    Training from scratch is not fine-tuning; removing layers or collecting data are unrelated to fine-tuning.
  3. Final Answer:

    Adjusting a pre-trained model to perform a specific task -> Option A
  4. Quick Check:

    Fine-tuning = adapt pre-trained model [OK]
Hint: Fine-tuning = adapt model, not train from scratch [OK]
Common Mistakes:
  • Confusing fine-tuning with training from scratch
  • Thinking fine-tuning means changing model structure
  • Mixing data collection with fine-tuning
3. Consider this Python-like pseudocode for fine-tuning a pre-trained model:
model = load_pretrained_model()
model.train(specific_task_data)
predictions = model.predict(test_data)
print(predictions)

What is the expected output of print(predictions)?
medium
A. Random values unrelated to the task
B. Predictions based on the specific task after fine-tuning
C. Error because model is not trained
D. Predictions from the original pre-trained model without changes

Solution

  1. Step 1: Understand the code flow

    The model is loaded pre-trained, then trained on specific task data (fine-tuning), then used to predict.
  2. Step 2: Predict output meaning

    After fine-tuning, predictions reflect the model adapted to the specific task, not random or original outputs.
  3. Final Answer:

    Predictions based on the specific task after fine-tuning -> Option B
  4. Quick Check:

    Fine-tuned model predicts task data [OK]
Hint: Fine-tuned model predicts task-specific outputs [OK]
Common Mistakes:
  • Assuming predictions are random
  • Thinking model is untrained
  • Ignoring fine-tuning effect on predictions
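The pseudocode in question 3 can be made concrete with a toy model. This is a minimal sketch, not a real language model: `TinyModel` is a hypothetical one-feature logistic regression, and the "pre-trained" weights returned by `load_pretrained_model` are made-up values chosen only to show that predictions change after training on task data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyModel:
    """One-feature logistic regression standing in for a large model."""

    def __init__(self, weight, bias):
        self.weight = weight
        self.bias = bias

    def predict(self, inputs):
        return [sigmoid(self.weight * x + self.bias) for x in inputs]

    def train(self, inputs, labels, learning_rate=0.5, steps=200):
        """Fine-tune with plain gradient descent on binary cross-entropy."""
        n = len(inputs)
        for _ in range(steps):
            preds = self.predict(inputs)
            errors = [p - y for p, y in zip(preds, labels)]
            grad_w = sum(e * x for e, x in zip(errors, inputs)) / n
            grad_b = sum(errors) / n
            self.weight -= learning_rate * grad_w
            self.bias -= learning_rate * grad_b


def load_pretrained_model():
    # Hypothetical "pre-trained" weights: weakly informative, not task-specific.
    return TinyModel(weight=0.1, bias=0.0)


task_inputs = [-2.0, -1.0, 1.0, 2.0]
task_labels = [0, 0, 1, 1]
test_data = [-2.0, 2.0]

model = load_pretrained_model()
before = model.predict(test_data)
model.train(task_inputs, task_labels)
predictions = model.predict(test_data)

print("before fine-tuning:", [round(p, 2) for p in before])
print("after fine-tuning: ", [round(p, 2) for p in predictions])
```

Before training, both predictions sit near 0.5; after training on the task data they move toward the task labels, which is the behavior option B describes.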
4. You try to fine-tune a pre-trained model but get an error: AttributeError: 'NoneType' object has no attribute 'train'. What is the most likely cause?
medium
A. The pre-trained model failed to load, returning None
B. The training data is empty
C. The model is already fine-tuned
D. The prediction method is called before training

Solution

  1. Step 1: Analyze the error message

    The error says 'NoneType' has no attribute 'train', meaning the model variable is None, not a model object.
  2. Step 2: Identify cause of None

    This usually happens if loading the pre-trained model failed and returned None instead of a model.
  3. Final Answer:

    The pre-trained model failed to load, returning None -> Option A
  4. Quick Check:

    None model means load failure [OK]
Hint: Check if model loaded correctly before training [OK]
Common Mistakes:
  • Blaming empty training data for this error
  • Assuming model is already fine-tuned
  • Confusing training and prediction order
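A defensive check right after loading turns this confusing error into a clear one. The sketch below assumes a hypothetical loader that returns None when the requested model is missing; `load_pretrained_model`, `require_model`, and the `registry` dictionary are illustrative names, not a real library API.

```python
def load_pretrained_model(registry, name):
    """Hypothetical loader that returns None when the model is missing."""
    return registry.get(name)


def require_model(model, name):
    """Fail early with a clear message instead of a later AttributeError."""
    if model is None:
        raise RuntimeError(f"Pre-trained model {name!r} failed to load")
    return model


registry = {"sentiment-base": object()}  # stand-in for real saved models

try:
    model = require_model(load_pretrained_model(registry, "missing-model"), "missing-model")
except RuntimeError as error:
    print(error)  # Pre-trained model 'missing-model' failed to load
```

Checking at load time keeps the failure next to its cause, rather than surfacing later when `train` is called on None.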
5. You have a large language model pre-trained on general text. You want to create a chatbot for medical advice. Which approach best uses pre-training and fine-tuning?
hard
A. Use the pre-trained model without any changes
B. Train a new model from scratch only on medical texts
C. Fine-tune the pre-trained model on medical conversation data
D. Pre-train the model again on medical texts before fine-tuning

Solution

  1. Step 1: Understand the goal

    The goal is to adapt a general language model to a specific medical chatbot task.
  2. Step 2: Evaluate options

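    Training from scratch is costly and slow, and it discards the general language ability the model already has. Using the model without changes leaves it without task-specific medical behavior. Pre-training again on medical texts is expensive and usually not needed as a first step, since it would still have to be followed by fine-tuning.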
  3. Step 3: Choose best approach

    Fine-tuning the pre-trained model on medical conversation data efficiently adapts it to the task.
  4. Final Answer:

    Fine-tune the pre-trained model on medical conversation data -> Option C
  5. Quick Check:

    Fine-tuning adapts general model to specific task [OK]
Hint: Fine-tune pre-trained model for specific tasks [OK]
Common Mistakes:
  • Training from scratch, which wastes resources
  • Using the pre-trained model without fine-tuning, which misses task needs
  • Pre-training again before fine-tuning, which adds cost with little benefit here