Introduction
LangSmith evaluators help you check how well your language models or chains are working. They give you clear feedback so you can improve your AI tools.
# Syntax: load an evaluator, then compare one prediction with one reference
from langchain.evaluation import load_evaluator
evaluator = load_evaluator("exact_match")
result = evaluator.evaluate_strings(prediction=prediction, reference=reference)

# Example: a short greeting that matches the reference exactly
from langchain.evaluation import load_evaluator
evaluator = load_evaluator("exact_match")
prediction = "Hello, world!"
reference = "Hello, world!"
result = evaluator.evaluate_strings(prediction=prediction, reference=reference)
print(result)

# Example: a one-word answer that matches the reference exactly
from langchain.evaluation import load_evaluator
evaluator = load_evaluator("exact_match")
prediction = "Yes"
reference = "Yes"
result = evaluator.evaluate_strings(prediction=prediction, reference=reference)
print(result)

This program uses the exact_match evaluator to check if the AI's answer matches the correct answer exactly. It prints the evaluation score.

from langchain.evaluation import load_evaluator
# Create an evaluator for string comparison
evaluator = load_evaluator("exact_match")
# AI's answer
prediction = "The capital of France is Paris."
# Correct answer
reference = "The capital of France is Paris."
# Evaluate the prediction against the reference; the result is a dict with a 'score' key
result = evaluator.evaluate_strings(prediction=prediction, reference=reference)
print(f"Evaluation result: {result['score']}")
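The program above only shows a matching pair. To see what a mismatch looks like without installing anything, here is a minimal pure-Python sketch of the idea behind exact matching. The function `exact_match_score` is a hypothetical stand-in written for illustration, not the library's implementation; it only mirrors the dict-with-`score` result shape used above.

```python
def exact_match_score(prediction: str, reference: str) -> dict:
    """Hypothetical stand-in for an exact-match evaluator (not library code)."""
    # Score 1 only when the two strings are identical, character for character.
    return {"score": 1 if prediction == reference else 0}


reference = "The capital of France is Paris."

# Identical strings score 1.
match = exact_match_score("The capital of France is Paris.", reference)

# Different casing and a missing period are enough to score 0.
mismatch = exact_match_score("the capital of france is paris", reference)

print(match["score"], mismatch["score"])
```

Exact matching is strict by design, so it suits short, unambiguous answers better than free-form text.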
Evaluators help you measure AI quality without guessing.
Choose the right evaluator type for your task to get useful feedback.
Evaluation results can be used to improve your AI models step by step.
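One way to use evaluation results step by step is to score a small set of examples, track the overall accuracy, and list the cases that failed. The sketch below assumes a hypothetical `score_example` helper and made-up sample data; it does not call LangSmith or LangChain.

```python
from statistics import mean


def score_example(prediction: str, reference: str) -> int:
    """Hypothetical scorer: 1 for an exact match, 0 otherwise."""
    return int(prediction == reference)


# Made-up sample data for illustration only.
examples = [
    {"question": "Capital of France?", "prediction": "Paris", "reference": "Paris"},
    {"question": "2 + 2?", "prediction": "5", "reference": "4"},
    {"question": "Largest planet?", "prediction": "Jupiter", "reference": "Jupiter"},
]

# Score every example, then summarize.
scores = [score_example(e["prediction"], e["reference"]) for e in examples]
accuracy = mean(scores)

# Collect the questions that scored 0 so they can be reviewed first.
failures = [e["question"] for e, s in zip(examples, scores) if s == 0]

print(f"accuracy={accuracy:.2f}")
print("needs work:", failures)
```

Re-running the same set after each change to a prompt or model shows whether the change helped.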
LangSmith evaluators check how good AI outputs are.
Use them to compare, grade, and improve AI answers.
They are easy to use: call evaluate_strings with a prediction string and a reference string.
# Correct: prediction first, then reference
evaluator = SomeEvaluator()
prediction = "The sky is blue."
reference = "The sky is clear and blue."
result = evaluator.evaluate(prediction, reference)
print(result)

# Incorrect: the arguments are swapped
evaluator = SomeEvaluator()
result = evaluator.evaluate(reference, prediction)
print(result)

evaluate expects its arguments in (prediction, reference) order.
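Argument-order mistakes are easy to make with positional calls. One possible guard, sketched here under assumptions, is a thin wrapper that accepts keyword arguments only. `KeywordOnlyEvaluator` and `toy_evaluate` are hypothetical names invented for this example, and the toy metric is deliberately asymmetric so that swapping the arguments visibly changes the score.

```python
class KeywordOnlyEvaluator:
    """Hypothetical wrapper that forces callers to name both arguments."""

    def __init__(self, evaluate_fn):
        self._evaluate_fn = evaluate_fn

    def evaluate(self, *, prediction: str, reference: str) -> dict:
        # The bare * makes positional calls raise TypeError.
        return self._evaluate_fn(prediction, reference)


def toy_evaluate(prediction: str, reference: str) -> dict:
    """Toy metric: fraction of reference words that appear in the prediction."""
    ref_words = reference.lower().rstrip(".").split()
    pred_words = set(prediction.lower().rstrip(".").split())
    hits = sum(1 for word in ref_words if word in pred_words)
    return {"score": hits / len(ref_words)}


prediction = "The sky is blue."
reference = "The sky is clear and blue."

safe = KeywordOnlyEvaluator(toy_evaluate)

# Named arguments cannot be swapped by accident.
result = safe.evaluate(prediction=prediction, reference=reference)

# Swapping positional arguments silently gives a different score.
swapped = toy_evaluate(reference, prediction)

print(result["score"], swapped["score"])
```

With this toy metric the correct order covers four of the six reference words, while the swapped call reports a perfect score, which is the kind of silent error the (prediction, reference) rule is meant to prevent.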