LangChain framework, ~20 mins

Custom evaluation metrics in LangChain - Practice Problems & Coding Challenges

Challenge - 5 Problems

❓ Component behavior

intermediate

What output does this custom metric function produce?

Consider this Python function used as a custom evaluation metric in LangChain:

def custom_metric(predictions, references):
    correct = sum(p == r for p, r in zip(predictions, references))
    total = len(references)
    return correct / total if total > 0 else 0

What is the output of custom_metric(['a', 'b', 'c'], ['a', 'x', 'c'])?

B. 0.3333333333333333

C. 0.6666666666666666

D. 1.0

📝 Syntax

intermediate

Which option correctly defines a custom metric function in LangChain?

Which of the following Python functions correctly defines a custom evaluation metric that returns the ratio of matching items between predictions and references?

def metric(predictions, references):
    return sum(p == r for p, r in zip(predictions, references)) * len(references)

def metric(predictions, references):
    return sum(p == r for p in predictions for r in references) / len(references)

def metric(predictions, references):
    return sum(predictions == references) / len(references)

def metric(predictions, references):
    return sum(p == r for p, r in zip(predictions, references)) / len(references)

🔧 Debug

advanced

What error does this custom metric code raise?

Given this custom metric function:

def metric(predictions, references):
    return sum(p == r for p, r in zip(predictions, references)) / len(predictions)

What error occurs if predictions is an empty list and references is non-empty, as in metric([], ['a', 'b'])?

A. ZeroDivisionError

B. IndexError

C. TypeError

D. No error, returns 0

🧠 Conceptual

advanced

Why use custom evaluation metrics in LangChain?

Which reason best explains why you might create a custom evaluation metric instead of using the built-in ones in LangChain?

A. To measure specific qualities of your model's output that built-in metrics don't capture

B. Because built-in metrics are always inaccurate and unreliable

C. Because LangChain requires custom metrics for all models

D. To make the evaluation run faster by avoiding built-in functions

❓ State output

expert

What is the final value of score after running this custom metric?

Consider this code snippet used in LangChain to evaluate predictions:

class CustomMetric:
    def __init__(self):
        self.correct = 0
        self.total = 0
    def update(self, predictions, references):
        for p, r in zip(predictions, references):
            if p == r:
                self.correct += 1
            self.total += 1
    def compute(self):
        return self.correct / self.total if self.total > 0 else 0

metric = CustomMetric()
metric.update(['a', 'b'], ['a', 'x'])
metric.update(['c'], ['c'])
score = metric.compute()

What is the value of score?

A. 0.5

B. 0.6666666666666666

C. 1.0

D. 0.0
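
The accumulate-then-compute pattern in the snippet above can also be tried on other data. The following is a minimal sketch in plain Python, not a LangChain API: the class name RunningExactMatch, the length check, and the reset method are assumptions added for illustration, and the batches differ from the ones in the question.

```python
from typing import Sequence


class RunningExactMatch:
    """Hypothetical helper: accumulates exact-match counts across batches."""

    def __init__(self) -> None:
        self.correct = 0
        self.total = 0

    def update(self, predictions: Sequence[str], references: Sequence[str]) -> None:
        # Assumption: mismatched batch sizes are treated as a caller error
        # instead of being silently truncated by zip().
        if len(predictions) != len(references):
            raise ValueError("predictions and references must have the same length")
        for p, r in zip(predictions, references):
            if p == r:
                self.correct += 1
            self.total += 1

    def compute(self) -> float:
        # Return 0.0 when nothing has been seen, avoiding a division by zero.
        return self.correct / self.total if self.total > 0 else 0.0

    def reset(self) -> None:
        # Clear the counters so the same object can score a new run.
        self.correct = 0
        self.total = 0


running = RunningExactMatch()
running.update(["x", "y"], ["x", "y"])  # 2 of 2 match
running.update(["z", "w"], ["q", "w"])  # 1 of 2 match
print(running.compute())  # 3 matches out of 4 items: 0.75
```

Because the counters persist between update calls, the score reflects every batch seen so far, not only the latest one.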

Practice

(1/5)

1. What is the main purpose of creating a custom evaluation metric in LangChain?

easy

A. To speed up the AI model training process

B. To measure AI results in a way that fits your specific needs

C. To automatically fix errors in AI outputs

D. To replace the AI model with a simpler one

Solution

Step 1: Understand the role of evaluation metrics

Step 2: Identify why custom metrics are used

Final Answer:

Quick Check:

Solution

Step 1: Recall LangChain class inheritance syntax

Step 2: Identify correct class definition

Final Answer:

Quick Check:

Solution

Step 1: Understand the evaluate method logic

Step 2: Apply inputs to the method

Final Answer:

Quick Check:

Solution

Step 1: Analyze the evaluate method with empty references

Step 2: Identify the runtime error cause

Final Answer:

Quick Check:
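
The evaluate method this solution refers to is not shown on the page, so the sketch below is only an assumption about its shape: a match ratio divided by len(references). It illustrates how an empty references list leads to a division by zero and one possible guard. The function names unguarded_ratio and safe_match_ratio are hypothetical.

```python
def unguarded_ratio(predictions, references):
    # Divides by len(references); raises ZeroDivisionError when references is empty.
    return sum(p == r for p, r in zip(predictions, references)) / len(references)


def safe_match_ratio(predictions, references):
    # Assumption: an empty reference set scores 0.0 instead of raising.
    if not references:
        return 0.0
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(references)


try:
    unguarded_ratio(["a"], [])
except ZeroDivisionError as exc:
    print("unguarded version failed:", exc)

print(safe_match_ratio(["a"], []))               # 0.0
print(safe_match_ratio(["a", "b"], ["a", "x"]))  # 0.5
```

Whether an empty input should score 0.0, return None, or raise a clearer error depends on how the score is consumed downstream.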

Solution

Step 1: Understand the goal of keyword-based scoring

Step 2: Identify the approach that measures keyword presence proportionally

Final Answer:

Quick Check:
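
The proportional keyword scoring mentioned in the steps above can be sketched as follows. This is an illustrative plain-Python function, not a LangChain API: the name keyword_coverage, the case-insensitive substring matching, and the 0.0 result for an empty keyword list are all assumptions.

```python
from typing import Iterable


def keyword_coverage(text: str, keywords: Iterable[str]) -> float:
    """Return the fraction of keywords that appear in text (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    if not wanted:
        # Assumption: with no keywords to look for, report 0.0 rather than divide by zero.
        return 0.0
    haystack = text.lower()
    found = sum(1 for k in wanted if k in haystack)
    return found / len(wanted)


answer = "LangChain supports custom evaluators"
score = keyword_coverage(answer, ["langchain", "custom", "metrics"])
print(score)  # two of the three keywords are present
```

Substring matching is the simplest choice here; it would also count a keyword that appears inside a longer word, so token-based matching may be preferable for stricter scoring.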