
Reflection and self-critique pattern in Agentic AI - Model Pipeline Trace

Model Pipeline - Reflection and self-critique pattern

This pipeline shows how an AI agent learns by reflecting on its own actions and self-critiquing to improve future decisions. It mimics how people think about what they did and how to do better next time.

Data Flow - 5 Stages
1. Initial Input
   Input: 1 agent state x 10 features
   Process: Agent receives current environment state and task info
   Output: 1 agent state x 10 features
   Agent sees: position=5, goal=10, energy=7, last_action=move_forward
2. Action Generation
   Input: 1 agent state x 10 features
   Process: Agent decides next action based on current state
   Output: 1 action vector x 3 possible actions
   Agent outputs probabilities: move_forward=0.7, turn_left=0.2, wait=0.1
3. Environment Response
   Input: 1 action vector x 3 possible actions
   Process: Environment updates state based on action
   Output: 1 new agent state x 10 features
   Agent new state: position=6, energy=6, last_action=move_forward
4. Reflection and Self-Critique
   Input: 1 new agent state x 10 features
   Process: Agent evaluates its last action outcome and scores success
   Output: 1 critique score (scalar)
   Agent critique: action_success=0.8 (good but can improve)
5. Policy Update
   Input: 1 critique score (scalar)
   Process: Agent adjusts decision-making policy to improve future actions
   Output: Updated policy parameters
   Agent increases preference for move_forward in similar states
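
The five stages can be sketched as a single pass through the loop. This is a minimal illustration, not the page's actual model: the softmax policy, the toy environment rules, the fixed critique scores (0.8 for progress, 0.2 otherwise), and the learning rate are all assumptions made for the example, and the policy here starts from uniform preferences rather than the 0.7/0.2/0.1 split shown in stage 2.

```python
import math

ACTIONS = ["move_forward", "turn_left", "wait"]

def action_probabilities(preferences):
    """Stage 2: softmax over per-action preference scores (assumed policy form)."""
    exps = [math.exp(p) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]

def environment_step(state, action):
    """Stage 3: toy environment; moving forward advances 1 position and costs 1 energy."""
    new_state = dict(state, last_action=action)  # copy, so the old state is preserved
    if action == "move_forward":
        new_state["position"] += 1
        new_state["energy"] -= 1
    return new_state

def critique(old_state, new_state):
    """Stage 4: hypothetical scoring rule; 0.8 if the agent got closer to the goal, else 0.2."""
    before = abs(old_state["goal"] - old_state["position"])
    after = abs(new_state["goal"] - new_state["position"])
    return 0.8 if after < before else 0.2

def update_policy(preferences, action_index, score, learning_rate=0.5):
    """Stage 5: raise the chosen action's preference when the score exceeds 0.5, lower it otherwise."""
    updated = list(preferences)
    updated[action_index] += learning_rate * (score - 0.5)
    return updated

# Stage 1: initial input, using the values from the walkthrough.
state = {"position": 5, "goal": 10, "energy": 7, "last_action": "move_forward"}
preferences = [0.0, 0.0, 0.0]

probs = action_probabilities(preferences)
chosen = 0  # pick move_forward directly to keep the walkthrough deterministic
new_state = environment_step(state, ACTIONS[chosen])
score = critique(state, new_state)
preferences = update_policy(preferences, chosen, score)
new_probs = action_probabilities(preferences)

print(new_state)
print(score)
print([round(p, 3) for p in new_probs])
```

After the update, the probability of move_forward is higher than before, which mirrors the "increases preference for move_forward" note in stage 5.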
Training Trace - Epoch by Epoch

[Chart: loss (y-axis) versus epoch (x-axis, epochs 1 to 5)]
Epoch | Loss ↓ | Accuracy ↑ | Observation
1     | 0.65   | 0.40       | Agent starts with random actions, low success
2     | 0.50   | 0.55       | Reflection helps agent learn from mistakes, improving decisions
3     | 0.38   | 0.70       | Agent better predicts good actions, loss decreases steadily
4     | 0.30   | 0.78       | Self-critique refines policy, accuracy climbs
5     | 0.25   | 0.83       | Agent converges to effective strategy with high success
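
One way to read the convergence claim is to look at how much each epoch improves on the previous one. The short sketch below uses only the loss and accuracy values from the trace above; the helper name `epoch_changes` is made up for this example.

```python
# (epoch, loss, accuracy) copied from the training trace.
trace = [
    (1, 0.65, 0.40),
    (2, 0.50, 0.55),
    (3, 0.38, 0.70),
    (4, 0.30, 0.78),
    (5, 0.25, 0.83),
]

def epoch_changes(rows):
    """Return (epoch, loss drop, accuracy gain) relative to the previous epoch."""
    changes = []
    for (_, prev_loss, prev_acc), (epoch, loss, acc) in zip(rows, rows[1:]):
        changes.append((epoch, round(prev_loss - loss, 2), round(acc - prev_acc, 2)))
    return changes

changes = epoch_changes(trace)
for epoch, loss_drop, acc_gain in changes:
    print(f"epoch {epoch}: loss -{loss_drop:.2f}, accuracy +{acc_gain:.2f}")
```

The loss drop shrinks from 0.15 to 0.05 across epochs, so each round of reflection still helps but by less, which is what the "converges" observation for epoch 5 describes.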
Prediction Trace - 5 Layers
Layer 1: Input State
Layer 2: Action Generation
Layer 3: Environment Update
Layer 4: Reflection and Self-Critique
Layer 5: Policy Update
Model Quiz - 3 Questions
Test your understanding
What is the main purpose of the reflection and self-critique stage?
A. To randomly select the next action
B. To reset the agent's state to the beginning
C. To evaluate the success of the last action and improve future decisions
D. To increase the agent's energy level
Key Insight
Reflection and self-critique allow an AI agent to learn from its own experiences by evaluating past actions and improving its decision-making policy. This feedback loop helps the agent become more effective over time.

Practice

1. What is the main purpose of the Reflection and self-critique pattern in AI?
easy
A. To store large amounts of data
B. To speed up AI computations
C. To help AI review and improve its own answers
D. To create new AI models automatically

Solution

  1. Step 1: Understand the pattern's goal

    The reflection and self-critique pattern is designed to let AI look back at its answers and find mistakes.
  2. Step 2: Identify the main benefit

    By reviewing its own work, AI can fix errors and improve future responses.
  3. Final Answer:

    To help AI review and improve its own answers -> Option C
  4. Quick Check:

    Reflection and self-critique = improve answers [OK]
Hint: Focus on improvement through self-review
Common Mistakes:
  • Confusing speed with accuracy
  • Thinking it stores data
  • Assuming it creates new models
2. Which of the following is the correct way to describe the reflection step in the pattern?
easy
A. AI reviews its previous answers to find mistakes
B. AI ignores previous answers and generates new ones
C. AI deletes all previous data to start fresh
D. AI copies answers from other models without checking

Solution

  1. Step 1: Define reflection in AI context

    Reflection means looking back at past answers to check for errors or improvements.
  2. Step 2: Match options to definition

    Only option A matches this definition: the AI reviews its previous answers to find mistakes. The other options ignore, delete, or copy answers without any review.
  3. Final Answer:

    AI reviews its previous answers to find mistakes -> Option A
  4. Quick Check:

    Reflection = review past answers [OK]
Hint: Reflection means reviewing past work carefully
Common Mistakes:
  • Thinking reflection means ignoring past answers
  • Confusing reflection with deleting data
  • Assuming copying answers is reflection
3. Consider this simple AI pseudo-code using reflection and self-critique:
answer = AI.generate_answer(question)
errors = AI.reflect(answer)
if errors:
    answer = AI.fix_errors(answer, errors)
print(answer)

What will print(answer) show if the AI finds errors?
medium
A. The original answer without changes
B. The corrected answer after fixing errors
C. No output because the program stops
D. An error message instead of an answer

Solution

  1. Step 1: Understand the code flow

    The AI first generates an answer, then reflects to find errors. If errors exist, it fixes them.
  2. Step 2: Determine the final printed output

    Since errors are fixed before printing, the output is the corrected answer.
  3. Final Answer:

    The corrected answer after fixing errors -> Option B
  4. Quick Check:

    Errors fixed before print = corrected answer [OK]
Hint: Errors fixed before print means corrected output
Common Mistakes:
  • Assuming original answer prints despite errors
  • Thinking program stops on errors
  • Confusing error message with fixed answer
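
To make the pseudo-code in this question concrete, here is a runnable sketch. The `ToyAI` class and its three methods are hypothetical stand-ins for the `AI` object in the question; a real system would call a model or a verifier instead of re-checking simple arithmetic.

```python
class ToyAI:
    """Hypothetical stand-in for the AI object in the quiz pseudo-code."""

    def generate_answer(self, question):
        # Deliberately wrong first draft so reflection has something to find.
        return "2 + 2 = 5"

    def reflect(self, answer):
        # Re-check the arithmetic claimed in the answer and list any problems.
        expression, claimed = answer.split(" = ")
        left, right = (int(n) for n in expression.split(" + "))
        actual = left + right
        if int(claimed) != actual:
            return [(claimed, str(actual))]  # (wrong value, correct value)
        return []

    def fix_errors(self, answer, errors):
        # Strings are immutable, so build and return a corrected copy.
        for wrong, correct in errors:
            answer = answer.replace(f"= {wrong}", f"= {correct}")
        return answer

ai = ToyAI()
answer = ai.generate_answer("What is 2 + 2?")
errors = ai.reflect(answer)
if errors:
    answer = ai.fix_errors(answer, errors)
print(answer)  # prints "2 + 2 = 4"
```

Because the result of `fix_errors` is assigned back to `answer`, the printed value is the corrected one, matching option B.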
4. You have this AI code snippet:
answer = AI.generate_answer(question)
errors = AI.reflect(answer)
if errors:
    AI.fix_errors(answer, errors)
print(answer)

Why might this code fail to print the corrected answer?
medium
A. Because fix_errors does not update answer variable
B. Because reflect never finds errors
C. Because print is called before generating answer
D. Because answer is not defined

Solution

  1. Step 1: Analyze variable updates

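    The fix_errors function is called, but its return value is discarded instead of being assigned back to answer. This assumes fix_errors returns a corrected copy rather than modifying answer in place, as it must when answer is an immutable value such as a Python string.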
  2. Step 2: Understand impact on output

    Since answer is unchanged, print shows the original, not corrected, answer.
  3. Final Answer:

    Because fix_errors does not update answer variable -> Option A
  4. Quick Check:

    Fix must assign back to answer [OK]
Hint: Assign fixed answer back to variable before printing
Common Mistakes:
  • Assuming reflect never finds errors
  • Thinking print is called too early
  • Ignoring variable assignment after fixing
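
The difference between the buggy and the corrected call can be seen side by side in a small sketch. The `fix_errors` function below is a hypothetical implementation that returns a corrected copy, and the draft text and error list are made up for the example.

```python
def fix_errors(answer, errors):
    """Hypothetical fixer: returns a corrected copy and leaves its argument untouched."""
    for wrong, correct in errors:
        answer = answer.replace(wrong, correct)
    return answer

draft = "The capital of France is Lyon"
errors = [("Lyon", "Paris")]

# Buggy version from the question: the return value is thrown away.
buggy = draft
fix_errors(buggy, errors)

# Corrected version: the result is assigned back to the variable.
fixed = draft
fixed = fix_errors(fixed, errors)

print(buggy)  # prints "The capital of France is Lyon"
print(fixed)  # prints "The capital of France is Paris"
```

Rebinding `answer` inside the function only changes the function's local name, so the caller sees the fix only if it stores the returned value.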
5. You want to improve an AI assistant using the reflection and self-critique pattern. Which approach best applies this pattern to reduce repeated mistakes over time?
hard
A. AI copies answers from a fixed database without checking
B. AI generates answers randomly to explore new possibilities
C. AI deletes old answers to save memory without review
D. After each answer, AI reviews its response, identifies errors, fixes them, and updates its knowledge base

Solution

  1. Step 1: Identify key steps in the pattern

    The pattern involves reviewing answers, finding errors, fixing them, and learning from mistakes.
  2. Step 2: Match approach to pattern goals

    Option D describes reviewing the response, fixing the errors, and updating the knowledge base, which covers every step of the pattern. The other options skip the review entirely.
  3. Final Answer:

    After each answer, AI reviews its response, identifies errors, fixes them, and updates its knowledge base -> Option D
  4. Quick Check:

    Review + fix + learn = improved AI [OK]
Hint: Choose option with review, fix, and learning steps
Common Mistakes:
  • Ignoring learning from errors
  • Choosing random or fixed answers
  • Skipping error identification
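
The review, fix, and learn cycle from option D can be sketched as follows. Everything here is an assumption for illustration: the `ReflectiveAssistant` class, its `lessons` dictionary acting as the knowledge base, and the `reference` lookup that stands in for whatever checker a real system would use during reflection.

```python
class ReflectiveAssistant:
    """Toy assistant that stores corrections so it does not repeat a mistake."""

    def __init__(self, reference):
        self.reference = reference  # hypothetical checker data used by reflect()
        self.lessons = {}           # knowledge base: question -> corrected answer
        self.corrections = 0        # how many times a draft had to be fixed

    def draft(self, question):
        # Reuse a stored lesson if one exists; otherwise give a naive guess.
        return self.lessons.get(question, "unknown")

    def reflect(self, question, answer):
        # Review the draft and return a list of suggested corrections.
        expected = self.reference.get(question)
        if expected is not None and answer != expected:
            return [expected]
        return []

    def answer(self, question):
        response = self.draft(question)
        errors = self.reflect(question, response)
        if errors:
            response = errors[0]               # fix the response
            self.lessons[question] = response  # update the knowledge base
            self.corrections += 1
        return response

assistant = ReflectiveAssistant({"2 + 2": "4"})
first = assistant.answer("2 + 2")
second = assistant.answer("2 + 2")
print(first, second, assistant.corrections)  # prints "4 4 1"
```

The second call needs no correction because the first call's fix was stored, which is the "reduce repeated mistakes over time" behavior the question asks about.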