Agent memory and state help an AI remember past information to make better decisions. To check whether memory works well, we look at the accuracy of the agent's responses or actions over time. We also use consistency metrics to see whether the agent keeps track of facts correctly across steps. For tasks like conversation, recall is important to ensure the agent remembers key details. For decision-making, precision matters to avoid wrong actions based on bad memory.
Agent memory and state in Prompt Engineering / GenAI - Model Metrics & Evaluation
Confusion matrix for the agent's memory recall:
                     Predicted Remembered   Predicted Forgotten
Actual Remembered    TP = 80                FN = 20
Actual Forgotten     FP = 10                TN = 90
Total samples = 200
- TP (True Positive): Agent correctly remembers a fact.
- FN (False Negative): Agent forgets a fact it should remember.
- FP (False Positive): Agent recalls something incorrectly.
- TN (True Negative): Agent correctly forgets irrelevant info.
Precision = TP / (TP + FP) = 80 / (80 + 10) = 0.89
Recall = TP / (TP + FN) = 80 / (80 + 20) = 0.80
F1 Score = 2 * (0.89 * 0.80) / (0.89 + 0.80) = 0.84
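The arithmetic above can be checked with a few lines of Python. This is a minimal sketch that uses only the four counts from the confusion matrix; the variable names are illustrative, and accuracy is included only for comparison.

```python
# Counts taken from the confusion matrix above.
tp, fn, fp, tn = 80, 20, 10, 90

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Prints: accuracy=0.85 precision=0.89 recall=0.80 f1=0.84
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```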
Imagine an AI assistant that remembers your preferences:
- High Precision: The assistant only recalls preferences it is very sure about. This avoids wrong suggestions but might miss some preferences (lower recall).
- High Recall: The assistant tries to remember all preferences, even uncertain ones. This catches more preferences but risks wrong recalls (lower precision).
For example, if the assistant forgets your favorite music genre (low recall), it may suggest bad songs. If it wrongly recalls a genre you dislike (low precision), it annoys you. Balancing precision and recall depends on what matters more: avoiding mistakes or remembering everything.
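One way to picture this trade-off is a confidence threshold on what the assistant is allowed to recall. The sketch below is an assumption made for illustration: the candidate list, the scores, and the precision_recall helper are all hypothetical and do not come from any specific agent framework.

```python
# Hypothetical candidate memories: (confidence score, is the recalled preference actually correct?)
candidates = [
    (0.95, True), (0.90, True), (0.80, True), (0.70, False),
    (0.60, True), (0.40, False), (0.30, True), (0.20, False),
]

def precision_recall(threshold):
    """Recall only candidates at or above the threshold, then score the result."""
    total_true = sum(1 for _, correct in candidates if correct)
    recalled = [correct for score, correct in candidates if score >= threshold]
    tp = sum(recalled)
    # Convention assumed here: precision is 1.0 when nothing is recalled.
    precision = tp / len(recalled) if recalled else 1.0
    recall = tp / total_true
    return precision, recall

strict = precision_recall(0.75)    # cautious assistant: few recalls, all correct
lenient = precision_recall(0.25)   # eager assistant: recalls everything true, plus some errors

print("strict :", strict)
print("lenient:", lenient)
```

With these made-up scores, the strict threshold gives perfect precision but misses two of the five true preferences, while the lenient threshold catches all five at the cost of two wrong recalls.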
Good metrics:
- Precision and recall above 0.85 show the agent remembers facts well and rarely makes wrong recalls.
- Consistency scores near 1.0 mean the agent keeps state stable over time.
- Low false negatives (FN) so important info is not forgotten.
Bad metrics:
- Precision or recall below 0.5 means the agent often forgets or wrongly recalls facts.
- High false positives (FP) cause wrong actions based on bad memory.
- Inconsistent state leads to confusing or contradictory responses.
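The text does not define a consistency score precisely, so the following is only one possible definition: the fraction of per-step fact checks where the value the agent reports matches the value it should still hold. The function name and the sample data are hypothetical.

```python
def consistency_score(expected, observed):
    """Fraction of (step, fact) checks where the agent's value matches the expected one."""
    checks = 0
    matches = 0
    for step_expected, step_observed in zip(expected, observed):
        for fact, value in step_expected.items():
            checks += 1
            if step_observed.get(fact) == value:
                matches += 1
    return matches / checks if checks else 1.0

# What the agent should hold at each of three steps.
expected = [
    {"name": "Ada", "genre": "jazz"},
    {"name": "Ada", "genre": "jazz"},
    {"name": "Ada", "genre": "jazz"},
]
# What the agent actually reported at each step.
observed = [
    {"name": "Ada", "genre": "jazz"},
    {"name": "Ada", "genre": "rock"},  # contradicts the earlier step
    {"name": "Ada"},                   # genre has been forgotten
]

score = consistency_score(expected, observed)
print(f"consistency={score:.2f}")
```

Here four of six checks match, so the score is well below 1.0 and flags both the contradiction and the forgotten fact.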
- Accuracy paradox: High overall accuracy can hide poor memory on rare but important facts.
- Data leakage: If test data includes info the agent already saw, metrics overestimate memory quality.
- Overfitting: Agent memorizes training data exactly but fails to generalize to new info.
- Ignoring temporal consistency: Metrics that don't check if memory stays stable over time miss key issues.
Your agent has 98% accuracy but only 12% recall on important facts it should remember. Is it good for production? Why or why not?
Answer: No, it is not good. The high accuracy likely comes from many easy cases or irrelevant info. The very low recall means the agent forgets most important facts, which harms user experience and trust. Improving recall is critical before production.
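To see how both numbers can hold at once, here is a small worked example. The counts are hypothetical and were chosen only so that they reproduce 98% accuracy and 12% recall; they are not measurements from a real agent.

```python
# Hypothetical counts chosen to reproduce the scenario above.
tp, fn = 12, 88      # 100 important facts, only 12 of them remembered
fp, tn = 12, 4888    # 4,900 irrelevant items, almost all correctly ignored

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
recall = tp / (tp + fn)

# Prints: accuracy=0.98 recall=0.12
print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
```

Because irrelevant items outnumber important facts 49 to 1 in this example, the true negatives dominate accuracy and hide the 88 forgotten facts.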
Practice
agent memory in AI systems?
Solution
Step 1: Understand agent memory role
Agent memory is designed to keep past information so the AI can remember what happened before.
Step 2: Differentiate from agent state
Agent state holds current context, not past data. Memory is about storing history.
Final Answer:
To store past information for future use -> Option B
Quick Check:
Agent memory = store past info [OK]
- Confusing memory with current state
- Thinking memory deletes old data automatically
- Assuming memory processes new input instantly
Solution
Step 1: Identify assignment syntax
In Python, to update a variable, use a single equals sign =.
Step 2: Check other options
== is a comparison, := is an assignment expression but not typical for a state update, and += adds to the existing value rather than replacing it.
Final Answer:
agent_state = new_state -> Option A
Quick Check:
Use = for assignment [OK]
- Using == instead of = for assignment
- Confusing := with = in simple updates
- Using += when replacement is needed
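The four operators discussed in this solution behave as follows. This is a small illustrative sketch; step_count and greeting are names invented for the example.

```python
agent_state = {"mood": "neutral"}
new_state = {"mood": "happy"}

is_same = agent_state == new_state   # comparison only: evaluates to False, changes nothing
agent_state = new_state              # assignment: agent_state now refers to new_state

step_count = 1
step_count += 2                      # augmented assignment: adds to the old value, giving 3

# := assigns inside an expression (Python 3.8+); handy in conditions, unusual for plain updates
if (mood := agent_state["mood"]) == "happy":
    greeting = f"Glad you are {mood}"

print(is_same, agent_state, step_count, greeting)
```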
agent_memory = []
agent_state = {'mood': 'neutral'}
# Agent receives new info
new_info = 'happy'
# Update memory and state
agent_memory.append(new_info)
agent_state['mood'] = new_info
print(agent_memory, agent_state)
What will be the output?
Solution
Step 1: Analyze memory update
The code appends new_info ('happy') to agent_memory, so memory becomes ['happy'].
Step 2: Analyze state update
The agent's state key 'mood' is updated to 'happy'.
Final Answer:
['happy'] {'mood': 'happy'} -> Option D
Quick Check:
Memory and state updated with 'happy' [OK]
- Forgetting append adds to list
- Confusing state key value with memory content
- Assuming memory or state unchanged
agent_memory = []
agent_state = {'status': 'idle'}
new_data = 'active'
# Intended to update memory and state
agent_memory = agent_memory.append(new_data)
agent_state['status'] == new_data
print(agent_memory, agent_state)
What is the main error causing unexpected output?
Solution
Step 1: Check memory update line
append() modifies the list in place and returns None. Assigning the result back sets agent_memory to None.
Step 2: Check state update line
The line uses ==, which compares but does not assign, so state remains unchanged.
Final Answer:
Using append() return value to assign memory -> Option C
Quick Check:
append() returns None, don't assign it [OK]
- Assigning append() result to list variable
- Using == instead of = for assignment
- Ignoring that append modifies list in place
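For comparison, a corrected version of the snippet from this question might look like the following. As written, the original prints None {'status': 'idle'}; the two changed lines are marked in comments.

```python
agent_memory = []
agent_state = {'status': 'idle'}
new_data = 'active'

agent_memory.append(new_data)      # mutate in place; do not reassign the None return value
agent_state['status'] = new_data   # single = assigns; == would only compare

print(agent_memory, agent_state)   # ['active'] {'status': 'active'}
```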
Solution
Step 1: Understand memory role for long-term data
Agent memory stores past info like user preferences across sessions.
Step 2: Understand state role for current context
Agent state holds current session details to adjust behavior immediately.
Final Answer:
Use agent memory to store preferences and agent state to track current session context -> Option A
Quick Check:
Memory = long-term, state = current context [OK]
- Using state for permanent storage
- Ignoring memory for preferences
- Resetting memory loses past info
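The split between long-term memory and per-session state can be sketched as a small class. PreferenceAgent and its methods are hypothetical names used only for illustration, and the memory here is a plain in-process dictionary; a real agent would need some persistent store to keep preferences across separate runs.

```python
class PreferenceAgent:
    """Hypothetical agent: memory outlives sessions, state is reset per session."""

    def __init__(self):
        self.memory = {}   # long-term: user preferences
        self.state = {}    # short-term: current session context

    def start_session(self):
        # Reset only the state; memory is left untouched.
        self.state = {"turns": 0, "topic": None}

    def remember(self, key, value):
        self.memory[key] = value

    def handle_turn(self, topic):
        self.state["turns"] += 1
        self.state["topic"] = topic
        genre = self.memory.get("genre", "unknown")
        return f"topic={topic}, preferred genre={genre}"


agent = PreferenceAgent()
agent.start_session()
agent.remember("genre", "jazz")
agent.handle_turn("music")

agent.start_session()                    # new session: state resets, memory survives
reply = agent.handle_turn("playlists")
print(reply, agent.state)
```

Storing the preference in state instead would lose it at the second start_session() call, which is the first mistake listed above.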
