This visual execution shows how to create and use a custom evaluation metric in LangChain. First, you define a metric function that compares a prediction to a reference and returns a score (for example, 1 for a match and 0 otherwise). You then pass this function to LangChain's evaluation utilities along with parallel lists of predictions and references. The evaluator calls your metric on each prediction–reference pair, collects the per-pair scores, and averages them into a single result. The execution table traces each call and the final aggregation step, and the variable tracker shows how values change as the evaluation proceeds. Key moments explain why the metric returns 1 or 0 for a given pair and how LangChain uses the result, and the quiz checks your understanding of the per-pair scores and the aggregation. Together, these views give beginners a step-by-step picture of how custom metrics work in LangChain evaluation.
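The flow described above can be sketched in plain Python without any LangChain dependency. This is a minimal, self-contained illustration of the pattern: the function names (`exact_match_metric`, `evaluate`) and the sample data are assumptions for this sketch, not part of LangChain's API.

```python
# Standalone sketch of the metric-and-average pattern described above.
# Names here are illustrative, not LangChain APIs.

def exact_match_metric(prediction: str, reference: str) -> int:
    """Return 1 if the prediction matches the reference exactly, else 0."""
    return 1 if prediction.strip() == reference.strip() else 0

def evaluate(metric, predictions, references):
    """Apply the metric to each (prediction, reference) pair and average."""
    scores = [metric(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)

predictions = ["Paris", "Berlin", "Madrid"]
references = ["Paris", "Rome", "Madrid"]

average = evaluate(exact_match_metric, predictions, references)
print(average)  # 2 of 3 pairs match, so the average is about 0.67
```

A real LangChain setup would wrap the metric in the library's evaluator interface, but the per-pair scoring and the final averaging step work exactly as shown here.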