Practice


1. What is the main purpose of creating a custom evaluation metric in LangChain?

easy

A. To speed up the AI model training process

B. To measure AI results in a way that fits your specific needs

C. To automatically fix errors in AI outputs

D. To replace the AI model with a simpler one

Solution

Step 1: Understand the role of evaluation metrics
Evaluation metrics measure how well an AI model performs its task.
Step 2: Identify why custom metrics are used
Custom metrics let you measure results in ways that standard metrics might not cover, fitting your unique needs.
Final Answer:
To measure AI results in a way that fits your specific needs -> Option B
Quick Check:
Custom metrics = tailored measurement [OK]

Hint: Custom metrics tailor scoring to your AI task [OK]

Common Mistakes:

Thinking custom metrics speed training
Believing they fix AI errors automatically
Confusing metrics with model replacement

2. Which of the following is the correct way to start defining a custom evaluation metric class in LangChain?

easy

A. class MyMetric(Evaluation):

B. def MyMetric():

C. class MyMetric():

D. function MyMetric extends Evaluation {}

Solution

Step 1: Recall Python class inheritance syntax
A custom metric is a Python class that inherits from a base class, here named Evaluation, using the form class Name(Base):.
Step 2: Identify correct class definition
class MyMetric(Evaluation): defines a class that inherits from Evaluation. Option B defines a function, option C omits the base class, and option D is not valid Python.
Final Answer:
class MyMetric(Evaluation): -> Option A
Quick Check:
Class inherits Evaluation = correct syntax [OK]

Hint: Use class inheritance with Evaluation base [OK]

Common Mistakes:

Defining a function instead of a class
Missing inheritance from Evaluation
Using JavaScript syntax in Python
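
The class-based pattern from this question can be sketched as follows. The quiz does not show where Evaluation comes from, so the block declares a hypothetical stand-in with that name (an assumption for illustration, not an import from LangChain) and subclasses it. The scoring rule inside evaluate is a placeholder.

```python
from abc import ABC, abstractmethod


class Evaluation(ABC):
    """Hypothetical stand-in for the quiz's base class (not imported from LangChain)."""

    @abstractmethod
    def evaluate(self, predictions, references):
        """Return a numeric score for predictions against references."""


class MyMetric(Evaluation):
    """A custom metric: inherit from the base class and implement evaluate()."""

    def evaluate(self, predictions, references):
        # Placeholder rule: 1.0 when both lists are identical, otherwise 0.0.
        return 1.0 if list(predictions) == list(references) else 0.0


metric = MyMetric()
print(metric.evaluate(["a"], ["a"]))
```

In LangChain releases that include the langchain.evaluation module, the documented route for a custom string metric is to subclass StringEvaluator and override _evaluate_strings; verify this against the version you have installed.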

3. Given this custom metric class and metric = ExactMatch(), what will metric.evaluate(['hello'], ['hello']) return?

class ExactMatch(Evaluation):
    def evaluate(self, predictions, references):
        return sum(p == r for p, r in zip(predictions, references)) / len(references)

medium

A. 1.0

B. 0.0

C. Error due to missing method

D. None

Solution

Step 1: Understand the evaluate method logic
It pairs each prediction with its reference, counts the pairs that match, and divides that count by the number of references.
Step 2: Apply inputs to the method
With predictions=['hello'] and references=['hello'], the single pair matches, so the sum is 1 and the length is 1, giving 1/1 = 1.0.
Final Answer:
1.0 -> Option A
Quick Check:
Exact match count / total = 1.0 [OK]

Hint: Check if predictions equal references, then divide [OK]

Common Mistakes:

Forgetting to divide by the number of references
Not realizing that sum() counts each True comparison as 1
Expecting the method to return a list instead of a single number
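
To check the arithmetic from this question, here is a runnable version of the class. The Evaluation base is again a hypothetical empty stand-in, since the quiz does not define it; the second call is an added example with one matching pair out of two.

```python
class Evaluation:
    """Hypothetical stand-in base class; the quiz does not show its definition."""


class ExactMatch(Evaluation):
    def evaluate(self, predictions, references):
        # Each comparison yields True or False; sum() counts True as 1.
        matches = sum(p == r for p, r in zip(predictions, references))
        return matches / len(references)


metric = ExactMatch()
single = metric.evaluate(["hello"], ["hello"])  # one pair, one match
mixed = metric.evaluate(["hello", "bye"], ["hello", "hi"])  # two pairs, one match
print(single, mixed)  # prints: 1.0 0.5
```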

4. What is wrong with this custom metric class that causes an error?

class LengthDiff(Evaluation):
    def evaluate(self, predictions, references):
        return abs(len(predictions) - len(references)) / len(references)

medium

A. It returns a number instead of a score between 0 and 1

B. It does not implement the evaluate method

C. It uses abs() incorrectly causing a syntax error

D. It does not handle empty lists causing runtime error

Solution

Step 1: Analyze the evaluate method with empty references
If references is [], then len(references) is 0 and the division raises ZeroDivisionError.
Step 2: Identify the runtime error cause
The code divides by len(references) without first checking that references is non-empty. An empty predictions list alone is harmless, because only references appears in the denominator.
Final Answer:
It does not handle empty lists causing runtime error -> Option D
Quick Check:
len(references)==0 -> ZeroDivisionError [OK]

Hint: Check how method handles empty input lists [OK]

Common Mistakes:

Assuming abs() causes syntax error
Thinking evaluate method is missing
Ignoring empty list edge cases
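
One possible way to close the gap identified in this question is to validate the input before dividing. The sketch below raises a ValueError for empty references; returning a default score such as 0.0 would be an equally reasonable choice, depending on how callers should treat missing data. Evaluation is a hypothetical stand-in base class.

```python
class Evaluation:
    """Hypothetical stand-in base class; the quiz does not show its definition."""


class LengthDiff(Evaluation):
    def evaluate(self, predictions, references):
        # Guard the denominator: an empty references list would divide by zero.
        if not references:
            raise ValueError("references must not be empty")
        return abs(len(predictions) - len(references)) / len(references)


metric = LengthDiff()
print(metric.evaluate(["a", "b", "c"], ["a", "b"]))  # |3 - 2| / 2 = 0.5
```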

5. You want to create a custom metric that scores AI answers higher if they contain more keywords from a reference list. Which approach fits best?

hard

A. Calculate the difference in length between prediction and reference

B. Check if prediction exactly matches the reference string

C. Count how many keywords appear in the prediction, divide by total keywords

D. Return a fixed score regardless of prediction content

Solution

Step 1: Understand the goal of keyword-based scoring
The metric should reward predictions containing more keywords from the reference list.
Step 2: Identify the approach that measures keyword presence proportionally
Counting keywords in prediction and dividing by total keywords gives a score reflecting keyword coverage.
Final Answer:
Count how many keywords appear in the prediction, divide by total keywords -> Option C
Quick Check:
Keyword coverage = matched keywords / total keywords [OK]

Hint: Score by keyword matches divided by total keywords [OK]

Common Mistakes:

Using exact match instead of keyword count
Measuring length difference unrelated to keywords
Returning fixed scores ignoring content
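
The keyword coverage approach from this question can be sketched as a small function. The name keyword_coverage, the case-insensitive substring matching, and the 0.0 score for an empty keyword list are all assumptions made for illustration; a real metric might tokenize the text or match whole words only.

```python
def keyword_coverage(prediction, keywords):
    """Fraction of keywords found in the prediction (case-insensitive substring match)."""
    if not keywords:
        # Assumed convention: no keywords to find means a score of 0.0.
        return 0.0
    text = prediction.lower()
    hits = sum(1 for keyword in keywords if keyword.lower() in text)
    return hits / len(keywords)


# Two of the three keywords ("paris", "france") appear in the prediction.
score = keyword_coverage(
    "Paris is the capital of France",
    ["paris", "france", "europe"],
)
print(round(score, 2))  # prints: 0.67
```

The same logic could be placed inside the evaluate method of a metric class if a class-based interface is required.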

