LangChain framework · ~10 mins

Custom evaluation metrics in LangChain - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
Task 1: Fill in the blank (easy)

Complete the code to define a custom evaluation metric function that returns accuracy.

LangChain
def custom_metric(predictions, references):
    correct = sum(p == r for p, r in zip(predictions, references))
    total = len(predictions)
    return correct [1] total
Options:
A. -
B. *
C. /
D. +
Common Mistakes
Using multiplication instead of division.
Subtracting instead of dividing.
Adding counts instead of dividing.
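For reference, a sketch of the completed function with the division operator filled in for blank [1]; the example inputs are hypothetical:

```python
def custom_metric(predictions, references):
    # Count pairwise matches, then divide by the total to get accuracy.
    correct = sum(p == r for p, r in zip(predictions, references))
    total = len(predictions)
    return correct / total  # blank [1] is `/`

# 3 of 4 predictions match the references.
print(custom_metric(["a", "b", "c", "d"], ["a", "b", "x", "d"]))  # 0.75
```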
Task 2: Fill in the blank (medium)

Complete the code to register the custom metric in LangChain's evaluation framework.

LangChain
from langchain.evaluation import Evaluation

eval = Evaluation()
eval.register_metric('accuracy', [1])
Options:
A. custom_metric()
B. custom_metric()()
C. 'custom_metric'
D. custom_metric
Common Mistakes
Calling the function instead of passing it.
Passing the function name as a string.
Adding extra parentheses.
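The point of the task is that blank [1] takes the bare function object, not a call. Note that the quiz's `Evaluation` class with a `register_metric` method is not a confirmed LangChain API; the minimal stand-in below is written from scratch purely to demonstrate the pass-the-callable idea:

```python
def custom_metric(predictions, references):
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(predictions)

class Evaluation:
    """Hypothetical stand-in for the quiz's Evaluation class (not a real
    LangChain import): it simply stores metric functions by name."""
    def __init__(self):
        self.metrics = {}

    def register_metric(self, name, fn):
        # Store the callable itself. Passing custom_metric() would store
        # its return value; passing 'custom_metric' would store a string.
        self.metrics[name] = fn

ev = Evaluation()
ev.register_metric('accuracy', custom_metric)  # blank [1]: custom_metric
print(ev.metrics['accuracy'](["a", "b"], ["a", "x"]))  # 0.5
```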
Task 3: Fill in the blank (hard)

Fix the error in the custom metric function to handle empty prediction lists safely.

LangChain
def custom_metric(predictions, references):
    if len(predictions) == 0:
        return 0
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct [1] len(predictions)
Options:
A. /
B. -
C. +
D. *
Common Mistakes
Using multiplication instead of division.
Not handling empty lists causing errors.
Returning incorrect calculations.
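A sketch of the completed function, showing why the guard matters: without it, dividing by `len([])` would raise `ZeroDivisionError`.

```python
def custom_metric(predictions, references):
    if len(predictions) == 0:
        return 0  # guard: avoids ZeroDivisionError on empty input
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(predictions)  # blank [1] is `/`

print(custom_metric([], []))        # 0
print(custom_metric(["a"], ["a"]))  # 1.0
```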
Task 4: Fill in the blanks (hard)

Fill both blanks to create a custom metric that calculates F1 score using precision and recall.

LangChain
def f1_score(precision, recall):
    return 2 * (precision [1] recall) [2] (precision + recall)
Options:
A. *
B. +
C. /
D. -
Common Mistakes
Using addition instead of multiplication.
Dividing by difference instead of sum.
Incorrect order of operations.
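The completed formula is the harmonic mean of precision and recall, F1 = 2PR / (P + R). A quick sketch with a sanity check:

```python
def f1_score(precision, recall):
    # F1 = 2 * (P * R) / (P + R): blank [1] is `*`, blank [2] is `/`.
    return 2 * (precision * recall) / (precision + recall)

# When precision == recall, F1 equals that common value.
print(f1_score(0.5, 0.5))  # 0.5
```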
Task 5: Fill in the blanks (hard)

Fill all three blanks to create a dictionary comprehension that maps each label to its F1 score if recall is above 0.5.

LangChain
f1_scores = {label: f1_score(precision[label], recall[label]) for label in labels if recall[label] [1] [2]}
filtered_scores = {k: v for k, v in f1_scores.items() if v [3] 0.7}
Options:
A. >
B. 0.5
C. >=
D. 0.7
Common Mistakes
Using wrong comparison operators.
Mixing up threshold values.
Incorrect dictionary comprehension syntax.
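A sketch of the completed comprehensions with the blanks filled in; the per-label precision and recall values are hypothetical, chosen only to make both filters visible:

```python
def f1_score(p, r):
    # Harmonic mean of precision and recall.
    return 2 * (p * r) / (p + r)

# Hypothetical per-label scores, purely for illustration.
precision = {"pos": 0.9, "neg": 0.6, "neu": 0.6}
recall    = {"pos": 0.8, "neg": 0.4, "neu": 0.6}
labels = precision.keys()

# Blanks [1] and [2] form the filter `recall[label] > 0.5`,
# so "neg" (recall 0.4) is dropped before F1 is computed.
f1_scores = {label: f1_score(precision[label], recall[label])
             for label in labels if recall[label] > 0.5}

# Blank [3] is `>`: keep only F1 scores above 0.7,
# which drops "neu" (F1 = 0.6).
filtered_scores = {k: v for k, v in f1_scores.items() if v > 0.7}

print(sorted(f1_scores))        # ['neu', 'pos']
print(sorted(filtered_scores))  # ['pos']
```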