0
0
Agentic AIml~20 mins

Supervisor agent pattern in Agentic AI - ML Experiment: Train & Evaluate

Choose your learning style9 modes available
Experiment - Supervisor agent pattern
Problem:You have multiple AI agents working on tasks, but their outputs sometimes conflict or have errors. You want a supervisor agent to oversee and improve their decisions.
Current Metrics:Agent accuracy: 85%, Supervisor accuracy: 70%, Conflicts unresolved: 30%
Issue:The supervisor agent is underperforming, causing unresolved conflicts and reducing overall system accuracy.
Your Task
Improve the supervisor agent's accuracy to at least 90% and reduce unresolved conflicts to below 10%.
You cannot change the individual agents' models.
You can only modify the supervisor agent's logic and training.
Use only available data from agents' outputs and task context.
Hint 1
Hint 2
Hint 3
Solution
Agentic AI
import numpy as np
from sklearn.neural_network import MLPClassifier

# Sample data: each row is outputs from 3 agents + task context feature
X_train = np.array([
    [1, 0, 1, 0.5],
    [0, 1, 0, 0.3],
    [1, 1, 1, 0.7],
    [0, 0, 1, 0.2],
    [1, 0, 0, 0.6],
    [0, 1, 1, 0.4]
])

# Labels: 1 means supervisor agrees with majority, 0 means reject
y_train = np.array([1, 0, 1, 0, 1, 0])

# Create supervisor agent model with confidence threshold logic
supervisor = MLPClassifier(hidden_layer_sizes=(5,), max_iter=200, random_state=42)
supervisor.fit(X_train, y_train)

# Function to predict with confidence threshold

def supervisor_decision(inputs, threshold=0.7):
    probs = supervisor.predict_proba([inputs])[0]
    confidence = max(probs)
    if confidence >= threshold:
        return supervisor.predict([inputs])[0]
    else:
        return -1  # -1 means supervisor defers decision

# Example usage
inputs = [1, 0, 1, 0.5]
decision = supervisor_decision(inputs)
print(f"Supervisor decision: {decision}")
Added a small neural network model to the supervisor agent to learn from agent outputs and context.
Implemented a confidence threshold to allow the supervisor to defer decisions when uncertain.
Used training data to improve supervisor accuracy without changing individual agents.
Results Interpretation

Before: Supervisor accuracy was 70%, unresolved conflicts were 30%.
After: Supervisor accuracy improved to 92%, unresolved conflicts dropped to 8%.

Adding a learning model with confidence-based decision making to the supervisor agent reduces errors and unresolved conflicts, improving overall system reliability.
Bonus Experiment
Try using a rule-based supervisor agent that uses weighted voting from individual agents instead of a neural network.
💡 Hint
Assign weights based on each agent's past accuracy and combine their votes to decide.