
Pre-training and fine-tuning concept in Prompt Engineering / GenAI - Model Metrics & Evaluation

Metrics & Evaluation - Pre-training and fine-tuning concept
Which metric matters and WHY

For pre-training and fine-tuning, the key metrics depend on the task the model is fine-tuned for. Common metrics include accuracy for classification, loss for general learning progress, and task-specific metrics such as BLEU for machine translation and other text generation, or F1 score when classes are imbalanced.

During pre-training, loss (like cross-entropy) is important to see if the model is learning general patterns. During fine-tuning, task-specific metrics matter more because they show how well the model adapts to the new task.
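To make the idea of cross-entropy loss concrete, here is a minimal sketch. The probabilities are made-up values, not taken from any real training run; they stand for how much probability the model gave to the correct answer on four examples.

```python
import math

def cross_entropy(true_class_probs):
    """Mean negative log-probability the model assigned to the correct class."""
    return -sum(math.log(p) for p in true_class_probs) / len(true_class_probs)

# Hypothetical probabilities given to the correct answer on four examples.
early_in_training = [0.25, 0.30, 0.20, 0.25]
later_in_training = [0.70, 0.80, 0.60, 0.90]

loss_early = cross_entropy(early_in_training)
loss_late = cross_entropy(later_in_training)

print(f"early loss: {loss_early:.3f}")
print(f"later loss: {loss_late:.3f}")
```

The later loss is lower because the model puts more probability on the correct answers. A falling loss curve shows this kind of progress, but it does not say how well the model does on a specific downstream task.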

Confusion matrix example

Imagine fine-tuning a model for spam detection. Here is a confusion matrix from the fine-tuned model:

                      | Predicted Spam            | Predicted Not Spam        |
                      |---------------------------|---------------------------|
      Actual Spam     | True Positives (TP) = 90  | False Negatives (FN) = 10 |
      Actual Not Spam | False Positives (FP) = 15 | True Negatives (TN) = 85  |

Total samples = 90 + 10 + 15 + 85 = 200

From this, precision = 90 / (90 + 15) = 0.857, recall = 90 / (90 + 10) = 0.9
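The same arithmetic can be checked in a few lines of Python. The four counts come from the spam example above; accuracy and F1 are added here as extra metrics computed from the same counts.

```python
# Counts from the spam-detection example.
tp, fp, fn, tn = 90, 15, 10, 85

total = tp + fp + fn + tn
accuracy = (tp + tn) / total       # share of all emails classified correctly
precision = tp / (tp + fp)         # of emails flagged as spam, how many were spam
recall = tp / (tp + fn)            # of actual spam emails, how many were caught
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(f"total={total} accuracy={accuracy:.3f} "
      f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# prints: total=200 accuracy=0.875 precision=0.857 recall=0.900 f1=0.878
```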

Precision vs Recall tradeoff with examples

When fine-tuning, you often balance precision and recall depending on the task:

  • High precision: Important when false alarms are costly. For example, in spam detection, you want to avoid marking good emails as spam.
  • High recall: Important when missing positive cases is costly. For example, in medical diagnosis, you want to catch as many sick patients as possible.

Fine-tuning helps adjust the model to this balance by training on task-specific data.
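One way to see the tradeoff is to take a model's scores and move the decision threshold. This is only an illustrative sketch: the labels and scores are invented, and changing the threshold is one common lever that this lesson does not cover in detail.

```python
def precision_recall(labels, scores, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    preds = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and not p)
    # Precision is undefined with no positive predictions; report 0.0 by convention.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical true labels (1 = positive) and model scores for ten examples.
labels = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.95, 0.90, 0.80, 0.65, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]

for threshold in (0.3, 0.5, 0.7):
    p, r = precision_recall(labels, scores, threshold)
    print(f"threshold={threshold:.1f} precision={p:.3f} recall={r:.3f}")
```

With these made-up numbers, the low threshold (0.3) catches every positive but raises false alarms (precision 0.625, recall 1.0), while the high threshold (0.7) raises no false alarms but misses positives (precision 1.0, recall 0.6).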

What good vs bad metric values look like

As a rough guide for a fine-tuned model on a balanced classification task (real targets depend on the task and its baseline):

  • Good: Accuracy above 85%, precision and recall above 80%, loss steadily decreasing.
  • Bad: Accuracy near random chance (e.g., 50% for binary), precision or recall very low (below 50%), loss not improving or increasing.

Good metrics on held-out data suggest the model learned useful features during pre-training and adapted well during fine-tuning.

Common pitfalls in metrics
  • Accuracy paradox: High accuracy can be misleading if classes are imbalanced. For example, a model that always predicts "negative" reaches 95% accuracy on data that is 95% negative, while catching no positives at all.
  • Data leakage: If test examples leak into the fine-tuning data, metrics look unrealistically good.
  • Overfitting: Very low training loss but poor test metrics means the model memorized training data and did not generalize.
  • Ignoring task metrics: Using only pre-training loss to judge fine-tuning success can be misleading.
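The accuracy paradox from the list above is easy to reproduce. This sketch uses an invented dataset of 100 examples with 5 positives and a "model" that always predicts negative.

```python
# Hypothetical imbalanced dataset: 5 positives, 95 negatives.
labels = [1] * 5 + [0] * 95
# A degenerate model that predicts "negative" for every example.
preds = [0] * len(labels)

accuracy = sum(1 for y, p in zip(labels, preds) if y == p) / len(labels)
tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
# prints: accuracy=0.95 recall=0.00
```

Accuracy looks strong, yet the model never finds a single positive case. This is why recall, precision, or F1 should be reported alongside accuracy on imbalanced data.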
Self-check question

Your fine-tuned model has 98% accuracy but only 12% recall on fraud detection. Is it good for production? Why or why not?

Answer: No, it is not good. The low recall means the model misses most fraud cases, which is dangerous. High accuracy is misleading because fraud cases are rare. You need to improve recall to catch more fraud.
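To see what those two numbers can look like in practice, here is a sketch that rebuilds a possible confusion matrix. The dataset size (10,000 transactions) and the 2% fraud rate are assumptions chosen for illustration; only the 98% accuracy and 12% recall come from the question.

```python
total = 10_000        # assumed number of transactions
fraud = 200           # assumed 2% fraud rate
accuracy = 0.98       # from the question
recall = 0.12         # from the question

tp = round(recall * fraud)               # fraud cases caught
fn = fraud - tp                          # fraud cases missed
errors = round((1 - accuracy) * total)   # all wrong predictions
fp = errors - fn                         # legitimate transactions flagged
tn = total - fraud - fp                  # legitimate transactions passed
precision = tp / (tp + fp)

print(f"caught={tp} missed={fn} false_alarms={fp} precision={precision:.2f}")
```

Under these assumptions the model catches 24 fraud cases and misses 176, even though 98% of its predictions are correct.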

Key Result
Pre-training loss shows general learning; fine-tuning metrics like precision and recall reveal task-specific performance and tradeoffs.

Practice

1. What is the main purpose of pre-training in machine learning models?
easy
A. To delete unnecessary data from the model
B. To adjust the model for a specific task
C. To evaluate the model's performance
D. To teach the model general knowledge from large data

Solution

  1. Step 1: Understand pre-training role

    Pre-training is done on large datasets to help the model learn general patterns and knowledge.
  2. Step 2: Differentiate from fine-tuning

    Fine-tuning is the step where the model is adapted to a specific task, not the initial general learning.
  3. Final Answer:

    To teach the model general knowledge from large data -> Option D
  4. Quick Check:

    Pre-training = General knowledge learning [OK]
Hint: Pre-training = general learning, fine-tuning = task-specific [OK]
Common Mistakes:
  • Confusing pre-training with fine-tuning
  • Thinking pre-training is for evaluation
  • Assuming pre-training deletes data
2. Which of the following is the correct way to describe fine-tuning?
easy
A. Adjusting a pre-trained model to perform a specific task
B. Removing layers from a neural network
C. Training a model from scratch on a small dataset
D. Collecting data for training

Solution

  1. Step 1: Define fine-tuning

    Fine-tuning means taking a model already trained on general data and adjusting it for a specific task.
  2. Step 2: Eliminate incorrect options

    Training from scratch is not fine-tuning; removing layers or collecting data are unrelated to fine-tuning.
  3. Final Answer:

    Adjusting a pre-trained model to perform a specific task -> Option A
  4. Quick Check:

    Fine-tuning = adapt pre-trained model [OK]
Hint: Fine-tuning = adapt model, not train from scratch [OK]
Common Mistakes:
  • Confusing fine-tuning with training from scratch
  • Thinking fine-tuning means changing model structure
  • Mixing data collection with fine-tuning
3. Consider this Python-like pseudocode for fine-tuning a pre-trained model:
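model = load_pretrained_model()          # start from the pre-trained weights
model.train(specific_task_data)          # fine-tune on task-specific data
predictions = model.predict(test_data)   # predict with the fine-tuned model
print(predictions)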

What is the expected output of print(predictions)?
medium
A. Random values unrelated to the task
B. Predictions based on the specific task after fine-tuning
C. Error because model is not trained
D. Predictions from the original pre-trained model without changes

Solution

  1. Step 1: Understand the code flow

    The model is loaded pre-trained, then trained on specific task data (fine-tuning), then used to predict.
  2. Step 2: Predict output meaning

    After fine-tuning, predictions reflect the model adapted to the specific task, not random or original outputs.
  3. Final Answer:

    Predictions based on the specific task after fine-tuning -> Option B
  4. Quick Check:

    Fine-tuned model predicts task data [OK]
Hint: Fine-tuned model predicts task-specific outputs [OK]
Common Mistakes:
  • Assuming predictions are random
  • Thinking model is untrained
  • Ignoring fine-tuning effect on predictions
4. You try to fine-tune a pre-trained model but get an error: AttributeError: 'NoneType' object has no attribute 'train'. What is the most likely cause?
medium
A. The pre-trained model failed to load, returning None
B. The training data is empty
C. The model is already fine-tuned
D. The prediction method is called before training

Solution

  1. Step 1: Analyze the error message

    The error says 'NoneType' has no attribute 'train', meaning the model variable is None, not a model object.
  2. Step 2: Identify cause of None

    This usually happens if loading the pre-trained model failed and returned None instead of a model.
  3. Final Answer:

    The pre-trained model failed to load, returning None -> Option A
  4. Quick Check:

    None model means load failure [OK]
Hint: Check if model loaded correctly before training [OK]
Common Mistakes:
  • Blaming empty training data for this error
  • Assuming model is already fine-tuned
  • Confusing training and prediction order
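The hint above can be turned into a small guard. This is a sketch only: load_pretrained_model here is a hypothetical stand-in that simulates a failed load by returning None, and the checkpoint name is made up.

```python
def load_pretrained_model(path):
    """Hypothetical loader; returns None to simulate a failed load."""
    return None

def fine_tune(path, task_data):
    model = load_pretrained_model(path)
    # Fail early with a clear message instead of an AttributeError later.
    if model is None:
        raise RuntimeError(f"could not load pre-trained model from {path!r}")
    model.train(task_data)
    return model

try:
    fine_tune("missing-checkpoint", task_data=[])
except RuntimeError as err:
    print(f"load check failed: {err}")
```

Checking right after loading points the error at the real cause (the load step) rather than at the later call to train.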
5. You have a large language model pre-trained on general text. You want to create a chatbot for medical advice. Which approach best uses pre-training and fine-tuning?
hard
A. Use the pre-trained model without any changes
B. Train a new model from scratch only on medical texts
C. Fine-tune the pre-trained model on medical conversation data
D. Pre-train the model again on medical texts before fine-tuning

Solution

  1. Step 1: Understand the goal

    The goal is to adapt a general language model to a specific medical chatbot task.
  2. Step 2: Evaluate options

    Training from scratch is costly and slow; using the model without changes leaves it without task-specific medical knowledge; repeating pre-training is expensive and usually unnecessary when fine-tuning data is available.
  3. Step 3: Choose best approach

    Fine-tuning the pre-trained model on medical conversation data efficiently adapts it to the task.
  4. Final Answer:

    Fine-tune the pre-trained model on medical conversation data -> Option C
  5. Quick Check:

    Fine-tuning adapts general model to specific task [OK]
Hint: Fine-tune pre-trained model for specific tasks [OK]
Common Mistakes:
  • Training from scratch, which wastes resources
  • Using the pre-trained model without fine-tuning, which misses task needs
  • Repeating pre-training, which is costly and usually redundant here