
Message roles (system, user, assistant) in Prompt Engineering / GenAI - Model Metrics & Evaluation

Which metric matters for Message roles and WHY

When working with message roles like system, user, and assistant in AI chat models, the key metric is role-classification accuracy: whether each message is attributed to the correct speaker. The system role sets instructions, the user role asks questions, and the assistant role replies. Correct role recognition is what lets the model behave as expected.
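To make the three roles concrete, here is a minimal sketch of the message format used by many chat-completion APIs (the exact field names follow the common OpenAI-style convention; a specific API may differ):

```python
# A minimal chat payload sketch: each message carries a role that
# tells the model who is speaking.
messages = [
    {"role": "system", "content": "You are a concise math tutor."},      # sets instructions
    {"role": "user", "content": "What is 2 + 2?"},                        # asks a question
    {"role": "assistant", "content": "2 + 2 = 4."},                       # the model's reply
]

# Role classification means recovering this sequence correctly.
roles = [m["role"] for m in messages]
print(roles)  # ['system', 'user', 'assistant']
```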

Confusion matrix for role classification
      | Actual \ Predicted | System | User | Assistant |
      |--------------------|--------|------|-----------|
      | System             | 90     | 5    | 5         |
      | User               | 3      | 92   | 5         |
      | Assistant          | 2      | 4    | 94        |

      Total samples = 300

      Precision and recall per role:
      - System Precision = 90 / (90 + 3 + 2) = 90 / 95 = 0.947
      - System Recall = 90 / (90 + 5 + 5) = 90 / 100 = 0.9
      - User Precision = 92 / (5 + 92 + 4) = 92 / 101 = 0.910
      - User Recall = 92 / (3 + 92 + 5) = 92 / 100 = 0.92
      - Assistant Precision = 94 / (5 + 5 + 94) = 94 / 104 = 0.904
      - Assistant Recall = 94 / (2 + 4 + 94) = 94 / 100 = 0.94
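The per-role numbers above can be recomputed directly from the matrix. In this sketch, rows hold the actual role and columns the predicted role, so precision divides by a column sum and recall by a row sum:

```python
# Confusion matrix from the table: rows = actual role, columns = predicted role.
roles = ["system", "user", "assistant"]
cm = [
    [90, 5, 5],   # actual system
    [3, 92, 5],   # actual user
    [2, 4, 94],   # actual assistant
]

def precision(cm, i):
    # precision = TP / everything predicted as role i (column sum)
    return cm[i][i] / sum(row[i] for row in cm)

def recall(cm, i):
    # recall = TP / everything that actually was role i (row sum)
    return cm[i][i] / sum(cm[i])

for i, role in enumerate(roles):
    print(f"{role}: precision={precision(cm, i):.3f}, recall={recall(cm, i):.3f}")
```

Running this reproduces the figures above, e.g. system precision 90/95 ≈ 0.947 and system recall 90/100 = 0.9.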
    
Precision vs Recall tradeoff with examples

In message role classification, precision measures how often a predicted role label is correct, while recall measures what fraction of the actual messages of a role were identified.

If precision is low, the model often assigns a role to messages that do not belong to it, causing wrong responses. If recall is low, many genuine messages of a role are missed or misclassified, leading to ignored instructions or questions.

Example: If the system role is confused with user role, the AI might ignore important instructions. Here, high recall for system role is critical to catch all instructions.

Example: If user messages are misclassified as assistant, the AI might respond to itself, causing confusion. High precision for user role avoids this.

What good vs bad metric values look like

Good: Precision and recall above 90% for all roles means the model correctly identifies who is speaking most of the time.

Bad: Precision or recall below 70% means many messages are misclassified. For example, if system role recall is 50%, half of instructions are missed, causing poor AI behavior.

Common pitfalls in metrics
  • Accuracy paradox: If one role is very common, high accuracy can hide poor performance on rare roles.
  • Data leakage: Evaluation data that overlaps with the training data (e.g., later turns of the same conversations) artificially inflates metrics.
  • Overfitting: Model memorizes training roles but fails on new conversations.
  • Ignoring role context: Metrics without considering conversation flow can mislead about real performance.
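The accuracy paradox from the first bullet is easy to demonstrate with invented numbers: if user messages dominate a test set, a model that almost always predicts "user" scores high accuracy while missing most system messages.

```python
# Accuracy-paradox sketch with a made-up, imbalanced test set:
# 10 system messages, 90 user messages.
actual = ["system"] * 10 + ["user"] * 90
# The model finds only 1 of the 10 system messages and labels
# everything else "user".
predicted = ["system"] * 1 + ["user"] * 99

accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
system_recall = sum(
    a == p == "system" for a, p in zip(actual, predicted)
) / actual.count("system")

print(accuracy)       # 0.91 -- looks healthy
print(system_recall)  # 0.1  -- 9 of 10 instructions missed
```

Overall accuracy of 91% hides the fact that 90% of the rare (and critical) system role is misclassified, which is why per-role precision and recall matter more than headline accuracy.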
Self-check question

Your model has 98% accuracy but only 12% recall on the system role. Is it good for production? Why or why not?

Answer: No, it is not good. Even though overall accuracy is high, the model misses 88% of system messages (instructions). This means important instructions are ignored, causing the AI to behave incorrectly. High recall on system role is critical.

Key Result
High precision and recall for each message role ensure the AI correctly understands and responds to system, user, and assistant messages.