What are LangSmith evaluators in LangChain?

LangSmith evaluators in LangChain

Introduction

LangSmith evaluators measure how well your language models or chains perform. Each evaluation turns an output into concrete feedback, such as a score, so you can see where your AI tools need to improve.

Evaluators are useful in situations like these:

When you want to see if your AI's answers are correct or useful.

When you need to compare different AI models to pick the best one.

When you want to track how your AI improves over time.

When you want to automatically grade AI responses in a project.

When you want to get detailed reports about AI performance.
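The comparison and grading use cases above can be sketched in plain Python. This is an illustration only, not the LangSmith or LangChain API: `exact_match_score`, `average_score`, and the model output lists are hypothetical names with made-up data, and the scoring simply mimics an exact-match check that yields 1 or 0 per answer.

```python
from statistics import mean


def exact_match_score(prediction: str, reference: str) -> int:
    """Hypothetical helper: 1 if the strings are identical, else 0."""
    return int(prediction == reference)


def average_score(predictions: list[str], references: list[str]) -> float:
    """Average the per-answer scores over a small dataset."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    per_item = [exact_match_score(p, r) for p, r in zip(predictions, references)]
    return mean(per_item)


# Made-up reference answers and outputs from two imaginary models.
references = ["Paris", "4", "Yes"]
model_a_outputs = ["Paris", "4", "No"]
model_b_outputs = ["Paris", "5", "No"]

scores = {
    "model_a": average_score(model_a_outputs, references),
    "model_b": average_score(model_b_outputs, references),
}

# Pick the model with the highest average score.
best_model = max(scores, key=scores.get)
print(scores)
print("Best model:", best_model)
```

In a real project the per-answer score would come from an evaluator rather than from the hypothetical helper, but the aggregation and comparison steps would look much the same.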

Syntax

LangChain

from langchain.evaluation import load_evaluator

evaluator = load_evaluator("exact_match")
result = evaluator.evaluate_strings(prediction=prediction, reference=reference)

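You use load_evaluator("exact_match") to create an evaluator object, then call its evaluate_strings method with the AI's output as prediction and the correct answer as reference. The method returns a dictionary; its score entry is 1 when the two strings match and 0 when they do not.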

Evaluators can be customized to check different things like accuracy, relevance, or style.
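One simple form of customization is to normalize both strings before comparing them, so that differences in case or punctuation do not count as errors. The sketch below shows the idea in plain Python. The function `normalized_match` and its `ignore_case` and `ignore_punctuation` parameters are hypothetical names chosen for illustration and are not presented as the library's implementation.

```python
import string


def normalized_match(
    prediction: str,
    reference: str,
    *,
    ignore_case: bool = True,
    ignore_punctuation: bool = True,
) -> dict:
    """Hypothetical evaluator: compare two strings after optional normalization."""

    def normalize(text: str) -> str:
        if ignore_case:
            text = text.lower()
        if ignore_punctuation:
            # Remove every ASCII punctuation character.
            text = text.translate(str.maketrans("", "", string.punctuation))
        return text.strip()

    # Return a dictionary with a 0/1 score, similar in shape to an evaluator result.
    return {"score": int(normalize(prediction) == normalize(reference))}


# Strict comparison: case and punctuation both matter.
strict = normalized_match(
    "hello, world", "Hello, world!", ignore_case=False, ignore_punctuation=False
)

# Lenient comparison: case and punctuation are ignored.
lenient = normalized_match("hello, world", "Hello, world!")

print("strict:", strict["score"])
print("lenient:", lenient["score"])
```

The same prediction fails the strict check and passes the lenient one, which is why the choice of evaluator settings should follow from what counts as a correct answer in your task.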

Examples

This example checks if the predicted text exactly matches the reference text.

LangChain

from langchain.evaluation import load_evaluator

evaluator = load_evaluator("exact_match")
prediction = "Hello, world!"
reference = "Hello, world!"
result = evaluator.evaluate_strings(prediction=prediction, reference=reference)
print(result)

This example evaluates yes/no answers, useful for boolean-like outputs.

LangChain

from langchain.evaluation import load_evaluator

evaluator = load_evaluator("exact_match")
prediction = "Yes"
reference = "Yes"
result = evaluator.evaluate_strings(prediction=prediction, reference=reference)
print(result)

Sample Program

This program uses the exact_match evaluator to check if the AI's answer matches the correct answer exactly. It prints the evaluation score.

LangChain

from langchain.evaluation import load_evaluator

# Create an evaluator for string comparison
evaluator = load_evaluator("exact_match")

# AI's answer
prediction = "The capital of France is Paris."

# Correct answer
reference = "The capital of France is Paris."

# Evaluate the prediction
result = evaluator.evaluate_strings(prediction=prediction, reference=reference)

print(f"Evaluation result: {result['score']}")

Output

Evaluation result: 1

Important Notes

Evaluators help you measure AI quality without guessing.

Choose the right evaluator type for your task to get useful feedback.

Evaluation results can be used to improve your AI models step by step.
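To make the step-by-step improvement idea concrete, one option is to collect the answers that fail evaluation and review them before the next change. The sketch below is a plain-Python illustration under stated assumptions: `failing_cases`, `toy_model`, and the example data are hypothetical, and the check is a simple exact comparison rather than a call into any library.

```python
from typing import Callable


def failing_cases(
    examples: list[tuple[str, str]],
    predict: Callable[[str], str],
) -> list[dict]:
    """Return one record per example whose prediction differs from its reference."""
    failures = []
    for question, reference in examples:
        prediction = predict(question)
        if prediction != reference:
            failures.append(
                {"question": question, "prediction": prediction, "reference": reference}
            )
    return failures


# Made-up dataset of (question, reference answer) pairs.
examples = [("Capital of France?", "Paris"), ("2 + 2?", "4")]


def toy_model(question: str) -> str:
    """Stand-in for a real model; it deliberately gets one answer wrong."""
    answers = {"Capital of France?": "Paris", "2 + 2?": "5"}
    return answers[question]


failures = failing_cases(examples, toy_model)
for failure in failures:
    print(failure)
```

Reviewing the failure records after each change shows whether a fix actually removed errors or only moved them elsewhere.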

Summary

LangSmith evaluators check how good AI outputs are.

Use them to compare, grade, and improve AI answers.

They are easy to use: call evaluate_strings with a prediction and a reference, then read the score from the returned dictionary.