
Automated evaluation metrics in Prompt Engineering / GenAI - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to calculate accuracy from predictions and labels.

accuracy = sum(predictions == [1]) / len(predictions)
Options:
A. targets
B. predictions
C. outputs
D. labels
Common Mistakes
Using predictions instead of labels for comparison
Dividing by wrong length
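To make the accuracy formula concrete, here is a minimal sketch in plain Python. The sample values are made up for illustration. Note that the `sum(predictions == ...)` form in the exercise assumes NumPy arrays, where `==` compares element by element; with plain lists `==` yields a single boolean, so this sketch compares pairs with `zip` instead.

```python
# Hypothetical sample data, not taken from the exercise.
predictions = [1, 0, 1, 1, 0]
labels = [1, 0, 0, 1, 1]

# Count the positions where the prediction matches the true label.
correct = sum(p == y for p, y in zip(predictions, labels))

# Accuracy is the fraction of matching positions.
accuracy = correct / len(labels)
print(accuracy)
```

Dividing by `len(labels)` or `len(predictions)` gives the same result only when both sequences have the same length, which is worth checking before computing the metric.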
2. Fill in the blank (medium)

Complete the code to compute precision score using sklearn.

precision = precision_score([1], predictions)
Options:
A. targets
B. labels
C. outputs
D. predictions
Common Mistakes
Swapping predictions and labels
Using wrong variable names
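Why argument order matters can be seen by computing precision by hand. The sketch below is an illustration, not scikit-learn's implementation: the helper `precision_binary` and the sample data are hypothetical. It follows the same convention as `precision_score`, which takes the true labels first and the predictions second.

```python
def precision_binary(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    # Return 0.0 when nothing was predicted positive (a choice made for this sketch).
    return tp / (tp + fp) if (tp + fp) else 0.0

# Hypothetical sample data.
labels = [1, 0, 0, 1, 1, 1]
predictions = [1, 0, 1, 1, 0, 0]

precision = precision_binary(labels, predictions)   # true labels first
swapped = precision_binary(predictions, labels)     # arguments swapped
print(precision, swapped)
```

With the arguments swapped, the function effectively computes recall instead of precision, so the two numbers differ on this data.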
3. Fill in the blank (hard)

Fix the error in computing F1 score by filling the missing argument.

f1 = f1_score(labels, predictions, average=[1])
Options:
A. 'binary'
B. 'micro'
C. 'weighted'
D. 'macro'
Common Mistakes
Using 'macro' or 'micro' for binary tasks
Omitting the average argument
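The averaging modes can give different numbers on the same data. The following sketch computes F1 by hand to show the difference between reporting only the positive class (the idea behind `'binary'`) and taking the unweighted mean over classes (the idea behind `'macro'`). The helper `f1_for_class` and the sample data are assumptions made for illustration.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 for one class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical sample data for a binary task.
labels = [1, 0, 0, 1, 1, 1]
predictions = [1, 0, 1, 1, 0, 0]

# 'binary'-style: score only the positive class (1).
f1_binary = f1_for_class(labels, predictions, cls=1)

# 'macro'-style: unweighted mean of the per-class scores.
per_class = [f1_for_class(labels, predictions, c) for c in (0, 1)]
f1_macro = sum(per_class) / len(per_class)
print(f1_binary, f1_macro)
```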
4. Fill in the blank (hard)

Fill both blanks to create a dictionary of recall scores for each class.

recall_scores = {cls: recall_score(labels, predictions, average=[1], labels=[cls]) for cls in [2]}
Options:
A. 'binary'
B. 'macro'
C. 'weighted'
D. classes
Common Mistakes
Using 'macro' average which averages all classes
Not iterating over class labels
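To show what a per-class recall dictionary contains, here is a minimal hand-rolled sketch that iterates over the class labels. The helper `recall_for_class`, the `classes` list, and the sample data are hypothetical and only mirror the shape of the exercise.

```python
def recall_for_class(y_true, y_pred, cls):
    """Recall = TP / (TP + FN): share of true `cls` items predicted as `cls`."""
    preds_for_cls = [p for t, p in zip(y_true, y_pred) if t == cls]
    if not preds_for_cls:
        return 0.0  # class absent from the true labels (a choice made for this sketch)
    return sum(1 for p in preds_for_cls if p == cls) / len(preds_for_cls)

# Hypothetical three-class sample data.
labels = [0, 1, 2, 2, 1, 0, 2]
predictions = [0, 2, 2, 2, 1, 1, 0]

# Derive the class list from the true labels.
classes = sorted(set(labels))

# One recall value per class, keyed by class label.
recall_scores = {cls: recall_for_class(labels, predictions, cls) for cls in classes}
print(recall_scores)
```

In scikit-learn, passing `average=None` to `recall_score` is another way to obtain one score per class in a single call.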
5. Fill in the blank (hard)

Fill all three blanks to compute a confusion matrix and extract true positives.

cm = confusion_matrix([1], [2])
true_positives = cm[[3], [3]]
Options:
A. labels
B. predictions
C. 1
D. 0
Common Mistakes
Swapping labels and predictions
Using wrong index for true positives
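The layout of a binary confusion matrix is easier to remember after building one by hand. This sketch uses the same convention as scikit-learn's `confusion_matrix`: rows are true labels and columns are predicted labels. The sample data is made up, and the matrix here is a nested list, so it is indexed as `cm[i][j]` rather than the NumPy-style `cm[i, j]`.

```python
# Hypothetical sample data for a binary task with classes 0 and 1.
labels = [1, 0, 0, 1, 1, 1]
predictions = [1, 0, 1, 1, 0, 0]

# cm[true][predicted]: rows are true labels, columns are predictions.
cm = [[0, 0], [0, 0]]
for t, p in zip(labels, predictions):
    cm[t][p] += 1

true_negatives = cm[0][0]   # true 0, predicted 0
false_positives = cm[0][1]  # true 0, predicted 1
false_negatives = cm[1][0]  # true 1, predicted 0
true_positives = cm[1][1]   # true 1, predicted 1
print(cm)
```

Swapping the two input sequences transposes the matrix, which exchanges the false positive and false negative counts while leaving the diagonal unchanged.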