
RAG evaluation metrics in Prompt Engineering / GenAI - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
Task 1: Fill in the blank (easy)

Complete the code to calculate the recall score for RAG model predictions.

from sklearn.metrics import [1]
recall = [1](true_labels, predicted_labels)
A. f1_score
B. precision_score
C. accuracy_score
D. recall_score
Common Mistakes:
- Using precision_score instead of recall_score
- Using accuracy_score, which is less informative for retrieval tasks
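For reference, here is a runnable sketch of the completed snippet. The label values below are synthetic, illustrative data, not part of the task:

```python
from sklearn.metrics import recall_score

# Synthetic binary relevance labels (illustrative only):
# 1 = relevant document, 0 = irrelevant.
true_labels = [1, 1, 0, 1, 0, 1]
predicted_labels = [1, 0, 0, 1, 0, 1]

# Recall = fraction of truly relevant items the model actually retrieved.
recall = recall_score(true_labels, predicted_labels)
print(recall)  # 3 of the 4 relevant items were predicted -> 0.75
```

Recall is a natural fit for RAG retrieval because it measures how many of the relevant documents were actually surfaced to the generator.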
Task 2: Fill in the blank (medium)

Complete the code to compute the F1 score for RAG model outputs.

from sklearn.metrics import [1]
f1 = [1](true_labels, predicted_labels)
A. f1_score
B. accuracy_score
C. recall_score
D. precision_score
Common Mistakes:
- Using accuracy_score, which does not balance precision and recall
- Confusing precision_score with f1_score
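A runnable sketch of the completed snippet, again with synthetic illustrative labels (not from the task):

```python
from sklearn.metrics import f1_score

# Synthetic binary labels (illustrative only).
true_labels = [1, 1, 0, 1, 0, 1]
predicted_labels = [1, 0, 1, 1, 0, 1]

# F1 is the harmonic mean of precision and recall, so it penalizes
# an imbalance between the two in a way accuracy does not.
f1 = f1_score(true_labels, predicted_labels)
print(f1)  # here precision = 3/4 and recall = 3/4, so F1 = 0.75
```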
Task 3: Fill in the blank (hard)

Fix the error in the code to calculate precision for RAG evaluation.

from sklearn.metrics import precision_score
precision = precision_score(true_labels, [1])
A. predicted_labels
B. true_labels
C. predictions
D. labels
Common Mistakes:
- Passing true_labels twice
- Using an undefined variable like 'labels'
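A runnable sketch of the corrected call, with synthetic illustrative labels (not from the task):

```python
from sklearn.metrics import precision_score

# Synthetic binary labels (illustrative only).
true_labels = [1, 0, 1, 1, 0, 0]
predicted_labels = [1, 1, 1, 0, 0, 0]

# The second argument must be the model's predictions, not true_labels
# again: precision_score(true_labels, true_labels) would trivially be 1.0.
precision = precision_score(true_labels, predicted_labels)
print(precision)  # 2 true positives out of 3 positive predictions -> ~0.667
```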
Task 4: Fill in the blank (hard)

Fill both blanks to create a dictionary of evaluation metrics for RAG.

metrics = {
    'precision': [1](true_labels, predicted_labels),
    'recall': [2](true_labels, predicted_labels)
}
A. precision_score
B. recall_score
C. f1_score
D. accuracy_score
Common Mistakes:
- Mixing up precision_score and recall_score
- Using accuracy_score, which is less relevant here
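A runnable sketch of the completed dictionary, with synthetic illustrative labels (not from the task):

```python
from sklearn.metrics import precision_score, recall_score

# Synthetic binary labels (illustrative only).
true_labels = [1, 1, 0, 1]
predicted_labels = [1, 0, 0, 1]

metrics = {
    'precision': precision_score(true_labels, predicted_labels),
    'recall': recall_score(true_labels, predicted_labels),
}
print(metrics)  # precision = 1.0 (no false positives), recall = 2/3
```

Collecting both metrics in one dictionary makes it easy to log or compare retrieval runs side by side.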
Task 5: Fill in the blank (hard)

Fill all three blanks to compute precision, recall, and F1 score for RAG evaluation.

results = {
    'precision': [1](true_labels, predicted_labels),
    'recall': [2](true_labels, predicted_labels),
    'f1': [3](true_labels, predicted_labels)
}
A. precision_score
B. recall_score
C. f1_score
D. accuracy_score
Common Mistakes:
- Using accuracy_score instead of f1_score
- Swapping recall_score and precision_score
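A runnable sketch of the full evaluation dictionary, with synthetic illustrative labels (not from the task):

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Synthetic binary labels (illustrative only).
true_labels = [1, 1, 0, 1, 0]
predicted_labels = [1, 0, 0, 1, 1]

results = {
    'precision': precision_score(true_labels, predicted_labels),
    'recall': recall_score(true_labels, predicted_labels),
    'f1': f1_score(true_labels, predicted_labels),
}
# With one false positive and one false negative here,
# all three metrics come out to 2/3.
print(results)
```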