Introduction
Evaluation helps catch mistakes early by testing how your code works before using it for real tasks.
from langchain.evaluation import load_evaluator

evaluator = load_evaluator("exact_match")
result = evaluator.evaluate_strings(prediction=generated_output, reference=expected_output)["score"]
Use evaluate_strings to compare your model's output with what you expect.
from langchain.evaluation import load_evaluator

evaluator = load_evaluator("exact_match")
# Identical strings: the prediction matches the reference
score = evaluator.evaluate_strings(prediction="Hello world", reference="Hello world")["score"]
# Different strings: the prediction does not match the reference
score = evaluator.evaluate_strings(prediction="Hi world", reference="Hello world")["score"]
This program compares a generated sentence with the expected one and prints the evaluation score. With the exact_match evaluator, a perfect match scores 1 and any difference scores 0.
from langchain.evaluation import load_evaluator

# Create evaluator instance
evaluator = load_evaluator("exact_match")

# Simulate generated and expected outputs
generated = "The quick brown fox jumps over the lazy dog"
expected = "The quick brown fox jumps over the lazy dog"

# Evaluate the outputs
score = evaluator.evaluate_strings(prediction=generated, reference=expected)["score"]
print(f"Evaluation score: {score}")
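To make the scoring rule concrete, here is a minimal library-free sketch of what an exact-match comparison amounts to. It assumes the score is 1 for identical strings and 0 otherwise. The helper name exact_match_score and its ignore_case flag are hypothetical and exist only for illustration; this is not LangChain's implementation.

```python
def exact_match_score(prediction: str, reference: str, ignore_case: bool = False) -> int:
    """Return 1 if the two strings are identical, else 0 (hypothetical helper)."""
    if ignore_case:
        # Optionally compare the strings without regard to letter case.
        prediction, reference = prediction.lower(), reference.lower()
    return int(prediction == reference)


match_score = exact_match_score("Hello world", "Hello world")
mismatch_score = exact_match_score("Hi world", "Hello world")
case_score = exact_match_score("hello world", "Hello world", ignore_case=True)

print(match_score, mismatch_score, case_score)  # prints 1 0 1
```

Exact matching is strict: a single differing character drops the score to 0, so it suits outputs with one correct answer better than free-form text.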
Always evaluate your chains before production to catch errors early.
Evaluation helps improve your prompts and model responses step-by-step.
Evaluation tests your code's output before real use.
It helps find and fix problems early.
Using evaluation improves reliability and user experience.
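One way to put these points into practice is to run a small set of test cases before deployment and look at which ones fail. The sketch below assumes a chain can be treated as a function from a string to a string. The names toy_chain and evaluate_cases are hypothetical stand-ins, not part of LangChain.

```python
from typing import Callable, List, Tuple


def toy_chain(text: str) -> str:
    """Hypothetical stand-in for a real chain: always replies with a greeting."""
    return f"Hello {text}"


def evaluate_cases(
    chain: Callable[[str], str], cases: List[Tuple[str, str]]
) -> Tuple[float, List[Tuple[str, str, str]]]:
    """Run every case and collect those whose output differs from the reference."""
    if not cases:
        raise ValueError("at least one test case is required")
    failures = []
    for prompt, expected in cases:
        actual = chain(prompt)
        if actual != expected:
            failures.append((prompt, expected, actual))
    accuracy = (len(cases) - len(failures)) / len(cases)
    return accuracy, failures


# Each case pairs an input with the output we expect for it.
cases = [("world", "Hello world"), ("Ada", "Hello Ada"), ("Bob", "Hi Bob")]
accuracy, failures = evaluate_cases(toy_chain, cases)

print(f"Accuracy: {accuracy:.2f}")
for prompt, expected, actual in failures:
    print(f"FAILED {prompt!r}: expected {expected!r}, got {actual!r}")
```

Keeping the failing cases, not just the overall accuracy, shows which prompts or outputs to revise next.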
Question: Which method evaluates my_chain?
Answer: evaluate(). run_evaluation(), evaluate_chain(), and eval() are not valid LangChain methods.

Question: What does the following code print if my_chain has a bug causing it to return None instead of a string?
result = my_chain.evaluate(input_data={'text': 'Hello'})
print(result)
Answer: The evaluate method returns the chain's output, or None if there is a bug. Printing None displays the word None in the console, not an error. The code prints None, indicating a problem. -> Option A

Question: The following code raises a TypeError saying evaluate() got an unexpected keyword argument 'input_data'. What is the likely cause?
result = my_chain.evaluate(input_data={'text': 'Test'})
print(result)
Answer: The error means the input_data argument is invalid. The evaluate method expects its inputs in a different form, and passing an unknown keyword causes this error. The evaluate method does not accept input_data as a parameter. -> Option B
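Both behaviours discussed above, a call that returns None and a call that rejects an unknown keyword, can be reproduced in plain Python. The sketch below uses a hypothetical ToyChain class as a stand-in for my_chain. It is not a LangChain class, and its evaluate signature is an assumption made for illustration.

```python
class ToyChain:
    """Hypothetical stand-in for my_chain; not a LangChain class."""

    def evaluate(self, inputs):
        # Simulated bug: the result is computed but never returned,
        # so the method implicitly returns None.
        text = inputs["text"].upper()


my_chain = ToyChain()

# Case 1: a buggy method that returns None. Printing it shows the word None.
result = my_chain.evaluate({"text": "Hello"})
print(result)  # prints None

# Case 2: passing a keyword the method does not define raises TypeError.
error_message = ""
try:
    my_chain.evaluate(input_data={"text": "Test"})
except TypeError as exc:
    error_message = str(exc)
    print(f"TypeError: {error_message}")
```

When this TypeError appears, checking the method's signature for the parameter names it actually accepts is usually the quickest fix.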