
Evaluation of fine-tuned models in Prompt Engineering / GenAI - Practice Problems & Coding Challenges

Challenge - 5 Problems
🎖️ Fine-Tuned Model Evaluation Master: get all five challenges correct to earn this badge. Test your skills under time pressure!
Metrics (intermediate)
Understanding evaluation metrics for fine-tuned classification models

You fine-tuned a text classification model and evaluated it on a test set. The model predicted labels for 100 samples. The confusion matrix is:

[[40, 10], [5, 45]]

(Rows are true labels and columns are predicted labels, as in sklearn's confusion_matrix.)

What is the accuracy of the model?

A. 0.90
B. 0.85
C. 0.80
D. 0.75
💡 Hint

Accuracy = (True Positives + True Negatives) / Total samples
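Once you've answered, you can check the arithmetic with a short pure-Python sketch (assuming, as is standard, that the diagonal entries are the correctly classified samples):

```python
# Confusion matrix from the problem statement.
cm = [[40, 10], [5, 45]]

# Correct predictions sit on the diagonal; accuracy = correct / total.
correct = cm[0][0] + cm[1][1]
total = sum(sum(row) for row in cm)
accuracy = correct / total
print(accuracy)  # 0.85
```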

Predict Output (intermediate)
Output of evaluation code for a fine-tuned regression model

Consider this Python code evaluating a fine-tuned regression model's predictions:

from sklearn.metrics import mean_squared_error
true = [3.0, -0.5, 2.0, 7.0]
pred = [2.5, 0.0, 2.0, 8.0]
mse = mean_squared_error(true, pred)
print(round(mse, 2))

What is the printed output?

A. 0.38
B. 0.50
C. 0.75
D. 1.25
💡 Hint

Mean Squared Error is the average of squared differences between true and predicted values.
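If you want to verify your prediction afterwards, the same computation can be done without sklearn; this pure-Python version mirrors what mean_squared_error returns:

```python
true = [3.0, -0.5, 2.0, 7.0]
pred = [2.5, 0.0, 2.0, 8.0]

# Squared differences: 0.25, 0.25, 0.0, 1.0 -> mean is 1.5 / 4 = 0.375.
mse = sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true)
print(round(mse, 2))  # 0.38
```

Note that Python's round() uses round-half-to-even, and 0.375 is exactly representable in binary, so the tie rounds up to 0.38 here.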

Model Choice (advanced)
Choosing the best evaluation metric for fine-tuned models on imbalanced data

You fine-tuned a model on a dataset where 95% of samples belong to class A and 5% to class B. Which evaluation metric is best to assess the model's performance on the minority class?

A. Accuracy
B. Precision
C. F1-score
D. Recall
💡 Hint

Consider a metric that balances precision and recall for the minority class.
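To see why per-class metrics matter here, the sketch below uses made-up labels (illustrative only, not from the problem) where class 1 is the rare class; overall accuracy looks healthy while the minority-class numbers tell a different story:

```python
# Hypothetical labels for illustration: class 1 is the minority class.
true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

# Counts for the minority class (label 1).
tp = sum(t == 1 and p == 1 for t, p in zip(true, pred))
fp = sum(t == 0 and p == 1 for t, p in zip(true, pred))
fn = sum(t == 1 and p == 0 for t, p in zip(true, pred))

accuracy = sum(t == p for t, p in zip(true, pred)) / len(true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(accuracy)  # 0.8 -- looks fine
print(f1)        # 0.5 -- minority class is handled poorly
```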

🔧 Debug (advanced)
Identifying the error in evaluation code for a fine-tuned model

What error will this code raise when evaluating a fine-tuned classification model?

from sklearn.metrics import accuracy_score
true_labels = [1, 0, 1, 1]
pred_labels = [1, 0, 0]
acc = accuracy_score(true_labels, pred_labels)
print(acc)
A. No error, prints accuracy
B. TypeError: unsupported operand type(s) for +: 'int' and 'str'
C. IndexError: list index out of range
D. ValueError: Found input variables with inconsistent numbers of samples
💡 Hint

Check if true and predicted label lists have the same length.
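sklearn validates input shapes before computing anything; the plain-Python stand-in below (a hypothetical helper, not sklearn's actual implementation) shows the same failure mode:

```python
true_labels = [1, 0, 1, 1]
pred_labels = [1, 0, 0]  # one label short

def accuracy(true, pred):
    # Mimic sklearn's consistency check before scoring.
    if len(true) != len(pred):
        raise ValueError(
            "Found input variables with inconsistent numbers of "
            "samples: [%d, %d]" % (len(true), len(pred))
        )
    return sum(t == p for t, p in zip(true, pred)) / len(true)

try:
    accuracy(true_labels, pred_labels)
except ValueError as err:
    print(err)
```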

🧠 Conceptual (expert)
Impact of fine-tuning on model evaluation metrics

After fine-tuning a pre-trained language model on a small dataset, you observe that the training accuracy is very high but the validation accuracy is low. What is the most likely explanation?

A. The model is overfitting the training data
B. The model is underfitting the training data
C. The validation data is corrupted
D. The learning rate is too low
💡 Hint

Think about what happens when a model learns training data too well but fails on new data.
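A minimal sketch of how you might flag this situation in practice (the accuracies and the 0.1 threshold are illustrative assumptions, not values from the problem):

```python
# Hypothetical metrics observed after a fine-tuning run.
train_acc = 0.99
val_acc = 0.62

# A large train/validation gap is the classic overfitting signature.
gap = train_acc - val_acc
if gap > 0.1:
    print(f"Likely overfitting: train-val gap = {gap:.2f}")
```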