ReAct pattern in Prompt Engineering / GenAI - Model Metrics & Evaluation

Metrics & Evaluation - ReAct pattern
Which metric matters for the ReAct pattern and WHY

The ReAct pattern interleaves reasoning and acting steps so that an AI model can make better decisions. To evaluate it, we focus on accuracy and task success rate. Accuracy shows how often the model's final answers are correct. Task success rate measures whether the model completes the intended task using its reasoning and actions. These metrics matter because ReAct aims to improve both understanding and execution, so we want to see whether the model reasons well and acts correctly.

Confusion matrix or equivalent visualization
Confusion Matrix for ReAct model task completion:

               Predicted Success   Predicted Failure
Actual Success       85 (TP)            15 (FN)
Actual Failure       10 (FP)            90 (TN)

Total samples = 200

Accuracy = (TP + TN) / Total = (85 + 90) / 200 = 0.875
Precision = TP / (TP + FP) = 85 / (85 + 10) = 0.8947
Recall = TP / (TP + FN) = 85 / (85 + 15) = 0.85
F1 Score = 2 * (Precision * Recall) / (Precision + Recall) = 2 * (0.8947 * 0.85) / (0.8947 + 0.85) = 0.8718

This matrix shows how well the ReAct model predicts successful task completion. High precision means most predicted successes are real: here, 85 of the 95 predicted successes. High recall means most actual successes are caught: here, 85 of the 100 actual successes.
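The arithmetic can be checked with a few lines of Python. This is a minimal sketch that uses only the four counts from the confusion matrix; the variable names are illustrative.

```python
# Counts taken from the confusion matrix above.
tp, fn, fp, tn = 85, 15, 10, 90

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"Total samples: {total}")     # 200
print(f"Accuracy:  {accuracy:.4f}")  # 0.8750
print(f"Precision: {precision:.4f}") # 0.8947
print(f"Recall:    {recall:.4f}")    # 0.8500
print(f"F1 score:  {f1:.4f}")        # 0.8718
```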

Precision vs Recall tradeoff with concrete examples

In ReAct models, balancing precision and recall is key:

  • High Precision: The model rarely claims success unless very sure. Good when false success is costly, like medical advice generation.
  • High Recall: The model tries to catch all successes, even if some are wrong. Useful when missing a success is worse, like emergency response planning.

Choosing which to prioritize depends on the task. For example, a ReAct model helping with legal advice should have high precision to avoid wrong guidance. A ReAct model for search and rescue should have high recall to not miss any possible success.
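One common way to move along this tradeoff is to change how confident the model must be before it claims success. The sketch below assumes each run comes with a confidence score, which the text above does not specify; the `runs` data and the `precision_recall` helper are made up for illustration.

```python
# Hypothetical evaluation data: (model's confidence that the task succeeded,
# whether the task actually succeeded). Values are invented for illustration.
runs = [
    (0.95, True), (0.90, True), (0.85, False), (0.80, True),
    (0.70, True), (0.60, False), (0.55, True), (0.40, False),
    (0.30, True), (0.10, False),
]

def precision_recall(runs, threshold):
    """Treat confidence >= threshold as a claimed success and score the claims."""
    tp = sum(1 for conf, ok in runs if conf >= threshold and ok)
    fp = sum(1 for conf, ok in runs if conf >= threshold and not ok)
    fn = sum(1 for conf, ok in runs if conf < threshold and ok)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

for threshold in (0.25, 0.50, 0.88):
    p, r = precision_recall(runs, threshold)
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data, raising the threshold from 0.25 to 0.88 lifts precision from 0.67 to 1.00 while recall drops from 1.00 to 0.33. A strict threshold suits the legal advice case, a lenient one the search and rescue case.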

What "good" vs "bad" metric values look like for ReAct pattern

Good metrics:

  • Accuracy above 85%
  • Precision and recall both above 80%
  • F1 score close to or above 85%
  • Consistent task success rate across different inputs

Bad metrics:

  • Accuracy below 70%
  • Precision or recall below 50%
  • Large gap between precision and recall (e.g., precision 90% but recall 30%)
  • Unstable task success rate, failing often on new inputs

Good metrics mean the ReAct model reasons and acts reliably. Bad metrics show it struggles to balance reasoning and action, leading to wrong or missed results.
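These thresholds can be turned into a rough automated check. The sketch below is one possible reading of the lists above, not a standard: the `rate_metrics` name, the "borderline" label, and the `max_gap` cutoff of 0.3 are assumptions, and it does not cover consistency of task success rate across inputs.

```python
def rate_metrics(accuracy, precision, recall, max_gap=0.3):
    """Return 'good', 'bad', or 'borderline' using the thresholds listed above.

    max_gap is an assumed cutoff for a "large" precision/recall gap;
    the text gives only an example (90% vs 30%), not a number.
    """
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    if (accuracy < 0.70 or min(precision, recall) < 0.50
            or abs(precision - recall) > max_gap):
        return "bad"
    if accuracy > 0.85 and min(precision, recall) > 0.80 and f1 >= 0.85:
        return "good"
    return "borderline"

print(rate_metrics(0.875, 0.8947, 0.85))  # good
print(rate_metrics(0.98, 0.50, 0.12))     # bad (recall below 50%)
print(rate_metrics(0.80, 0.75, 0.70))     # borderline
```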

Common pitfalls in evaluating ReAct pattern metrics
  • Accuracy paradox: High accuracy can be misleading if data is imbalanced (e.g., mostly failures). Always check precision and recall.
  • Data leakage: If the model sees answers during training, metrics will be unrealistically high.
  • Overfitting: The model performs well on training tasks but poorly on new ones, and high training accuracy hides the problem. Always evaluate on held-out tasks.
  • Ignoring task complexity: Metrics alone don't show if reasoning steps are meaningful or just memorized.
  • Not measuring intermediate reasoning quality: Only final output metrics miss how well the model reasons before acting.
Self-check question

Your ReAct model has 98% accuracy but only 12% recall on successful task completions. Is it good for production? Why or why not?

Answer: No, it is not good. The very low recall means the model misses most actual successes, even if it rarely makes false success claims. This means many tasks that should succeed are not recognized, which can be critical depending on the application. High accuracy alone is misleading here.
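To see how both numbers can hold at once, consider a heavily imbalanced test set. The counts below are hypothetical, chosen only so that they reproduce 98% accuracy and 12% recall.

```python
# Hypothetical counts chosen to reproduce the self-check numbers:
# 5,000 tasks, of which only 100 actually succeed (heavily imbalanced).
tp, fn = 12, 88      # actual successes: 12 recognized, 88 missed
fp, tn = 12, 4888    # actual failures: 12 wrongly claimed as success

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
recall = tp / (tp + fn)
precision = tp / (tp + fp)

print(f"accuracy={accuracy:.2%}  recall={recall:.2%}  precision={precision:.2%}")
# accuracy=98.00%  recall=12.00%  precision=50.00%
```

Almost all of the accuracy comes from the 4,888 correctly labeled failures, which is why accuracy says little about how the model handles the rare successes.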

Key Result
For the ReAct pattern, balanced precision and recall, both above 80%, indicate reliable reasoning and acting.

Practice

1. What is the main purpose of the ReAct pattern in AI?
easy
A. To speed up AI training by skipping reasoning
B. To combine thinking and acting steps for better problem solving
C. To store large datasets efficiently
D. To replace human decision making completely

Solution

  1. Step 1: Understand the ReAct pattern concept

    The ReAct pattern mixes reasoning (thinking) and actions (doing) to solve problems step-by-step.
  2. Step 2: Identify the main goal

    This approach helps AI be more transparent and effective by breaking down tasks into Thought, Action, Observation, and Final Answer.
  3. Final Answer:

    To combine thinking and acting steps for better problem solving -> Option B
  4. Quick Check:

    ReAct = Reason + Act [OK]
Hint: Remember ReAct means think then do, step-by-step [OK]
Common Mistakes:
  • Thinking AI skips actions
  • ReAct stores data only
  • ReAct replaces humans fully
2. Which of the following shows the correct sequence in the ReAct pattern?
easy
A. Thought -> Action -> Observation -> Final Answer
B. Action -> Thought -> Final Answer -> Observation
C. Observation -> Final Answer -> Thought -> Action
D. Final Answer -> Thought -> Action -> Observation

Solution

  1. Step 1: Recall the ReAct step order

    The ReAct pattern follows a clear order: first the AI thinks (Thought), then acts (Action), then sees results (Observation), and finally gives the answer.
  2. Step 2: Match the correct sequence

    Option A lists exactly this order: Thought -> Action -> Observation -> Final Answer.
  3. Final Answer:

    Thought -> Action -> Observation -> Final Answer -> Option A
  4. Quick Check:

    Order = T -> A -> O -> FA [OK]
Hint: Think first, then act, observe, answer [OK]
Common Mistakes:
  • Mixing up Observation and Action order
  • Putting Final Answer before Observation
  • Skipping Thought step
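The step order can also be checked programmatically. The sketch below assumes a trace is stored as a list of step labels, which is a hypothetical format, and it additionally accepts several Thought -> Action -> Observation cycles before the Final Answer, which goes beyond the single cycle in the question.

```python
def follows_react_order(steps):
    """Check that a trace is one or more Thought -> Action -> Observation
    cycles followed by a single Final Answer.

    `steps` is a list of step labels; this label format is an assumption.
    """
    cycle = ["Thought", "Action", "Observation"]
    if not steps or steps[-1] != "Final Answer":
        return False
    body = steps[:-1]
    if not body or len(body) % 3 != 0:
        return False
    return all(label == cycle[i % 3] for i, label in enumerate(body))

# Option A from the question above.
print(follows_react_order(["Thought", "Action", "Observation", "Final Answer"]))  # True
# Option B from the question above.
print(follows_react_order(["Action", "Thought", "Final Answer", "Observation"]))  # False
```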
3. Given this simplified ReAct code snippet:
thought = 'Check weather'
action = 'Query weather API'
observation = 'It is sunny'
final_answer = f"Weather is {observation}"
print(final_answer)

What will be the printed output?
medium
A. Check weather
B. It is sunny
C. Query weather API
D. Weather is It is sunny

Solution

  1. Step 1: Understand variable assignments

    The variable observation holds the string 'It is sunny'. The final_answer uses this to create 'Weather is It is sunny'.
  2. Step 2: Evaluate the print statement

    The print outputs the final_answer string, which is 'Weather is It is sunny' because the f-string inserts the full observation string.
  3. Final Answer:

    Weather is It is sunny -> Option D
  4. Quick Check:

    Output includes 'Weather is' + observation [OK]
Hint: Look at final_answer string formatting carefully [OK]
Common Mistakes:
  • Ignoring f-string variable insertion
  • Printing wrong variable
  • Confusing observation with action
4. Identify the error in this ReAct step code:
thought = 'Calculate sum'
action = 'Add 2 and 3'
observation = 2 + 3
final_answer = 'Sum is ' + observation
print(final_answer)
medium
A. Cannot concatenate string and integer directly
B. Missing action execution step
C. Observation should be a string, not a number
D. Final answer should be a number, not string

Solution

  1. Step 1: Analyze the final_answer concatenation

    The code tries to add a string 'Sum is ' and an integer observation (5) directly, which causes a TypeError in Python.
  2. Step 2: Identify the fix

    To fix, convert observation to string using str(observation) before concatenation.
  3. Final Answer:

    Cannot concatenate string and integer directly -> Option A
  4. Quick Check:

    String + int causes error [OK]
Hint: Convert numbers to strings before adding to text [OK]
Common Mistakes:
  • Ignoring type mismatch in concatenation
  • Thinking observation must be string always
  • Confusing action with observation
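For reference, here is a corrected version of the snippet, showing two equivalent fixes. The second variable name, `final_answer_f`, is introduced only for this illustration.

```python
thought = 'Calculate sum'
action = 'Add 2 and 3'
observation = 2 + 3  # integer 5

# Option 1: convert the integer explicitly before concatenating.
final_answer = 'Sum is ' + str(observation)
print(final_answer)    # Sum is 5

# Option 2: let an f-string handle the conversion.
final_answer_f = f"Sum is {observation}"
print(final_answer_f)  # Sum is 5
```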
5. You want to build a ReAct-based AI assistant that solves math problems step-by-step. Which approach best applies the ReAct pattern?
hard
A. AI randomly guesses answers and checks correctness later
B. AI immediately gives the answer without intermediate steps
C. AI thinks about the problem, performs a calculation action, observes the result, then states the final answer
D. AI stores all previous answers without reasoning

Solution

  1. Step 1: Understand ReAct for stepwise problem solving

    The ReAct pattern requires the AI to think (reason), act (calculate), observe (check result), and then answer.
  2. Step 2: Match the approach to ReAct steps

    Option C describes this exact process: the AI thinks about the problem, performs a calculation action, observes the result, then states the final answer. This makes the AI transparent and effective in solving math problems step-by-step.
  3. Final Answer:

    AI thinks about the problem, performs a calculation action, observes the result, then states the final answer -> Option C
  4. Quick Check:

    ReAct = Thought + Action + Observation + Answer [OK]
Hint: Follow Thought -> Action -> Observation -> Answer for stepwise AI [OK]
Common Mistakes:
  • Skipping reasoning steps
  • Guessing without observation
  • Ignoring stepwise transparency
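A minimal sketch of such an assistant is shown below. It is a toy under stated assumptions: `scripted_policy` is a hard-coded stand-in for a language model, `calculator` is a hypothetical tool that handles only "a op b" with integers, and the trace format is illustrative rather than any library's API.

```python
import operator

# Toy calculator tool: the only "action" this sketch supports.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def calculator(expression):
    """Evaluate 'a op b' for integers a, b and an operator in OPS."""
    left, op, right = expression.split()
    return OPS[op](int(left), int(right))

def scripted_policy(question, observations):
    """Stand-in for a language model (hypothetical, hard-coded for one question).

    Returns (thought, action_input) while work remains, or (thought, None)
    when it is ready to answer.
    """
    if len(observations) == 0:
        return "First multiply 4 by 6.", "4 * 6"
    if len(observations) == 1:
        return f"Now add 10 to {observations[0]}.", f"{observations[0]} + 10"
    return "I have the result.", None

def react_solve(question, policy, max_steps=5):
    """Run Thought -> Action -> Observation cycles until the policy stops."""
    observations, trace = [], []
    for _ in range(max_steps):
        thought, action_input = policy(question, observations)
        trace.append(f"Thought: {thought}")
        if action_input is None:
            break
        trace.append(f"Action: calculator[{action_input}]")
        result = calculator(action_input)
        observations.append(result)
        trace.append(f"Observation: {result}")
    final = observations[-1] if observations else None
    trace.append(f"Final Answer: {final}")
    return final, trace

answer, trace = react_solve("What is 4 * 6 + 10?", scripted_policy)
print("\n".join(trace))
```

The printed trace alternates Thought, Action, and Observation lines and ends with "Final Answer: 34". In a real assistant, the policy would be a model call and the observation would be fed back into the next prompt.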