Agentic AI · ~20 mins

Computer Use Agents in Agentic AI - ML Experiment: Train & Evaluate

Experiment - Computer use agents
Problem: We want to build a computer-use agent that learns to operate a simple computer interface to complete tasks like opening files or clicking buttons.
Current Metrics: Training success rate: 98%, Validation success rate: 65%
Issue: The agent is overfitting: it performs very well on training tasks but poorly on new, unseen tasks.
Your Task
Reduce overfitting so that the validation success rate improves to at least 85%, while the training success rate stays below 95%.
You may change only the agent's training method and model architecture.
You may not change the task environment or the data itself.
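The success criteria can be expressed as a small check. A minimal sketch (the `meets_criteria` helper and its rates-as-fractions convention are illustrative, not part of the exercise):

```python
# Hypothetical helper: does a training run satisfy the task's targets?
# Rates are fractions in [0, 1], e.g. 0.85 for 85%.
def meets_criteria(train_rate, val_rate):
    # Validation success must reach at least 85% while training success
    # stays below 95%, i.e. the memorization/generalization gap narrows.
    return val_rate >= 0.85 and train_rate < 0.95

print(meets_criteria(0.98, 0.65))  # current metrics -> False
print(meets_criteria(0.92, 0.87))  # a passing run   -> True
```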
Solution
import tensorflow as tf
from tensorflow.keras import layers, models

# Define the agent model with dropout to reduce overfitting.
# Dimensions are passed as arguments rather than read from globals,
# so the function is self-contained.
def create_agent_model(input_dim, output_dim):
    model = models.Sequential([
        layers.Dense(128, activation='relu', input_shape=(input_dim,)),
        layers.Dropout(0.5),   # randomly zero 50% of activations during training
        layers.Dense(64, activation='relu'),
        layers.Dropout(0.3),   # lighter dropout deeper in the network
        layers.Dense(output_dim, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# Assume X_train, y_train, X_val, y_val are prepared datasets

input_dim = 100  # example input size (features describing the interface state)
output_dim = 10  # example output size (number of possible actions)

agent_model = create_agent_model(input_dim, output_dim)

# Use early stopping to stop training when validation loss stops improving
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

history = agent_model.fit(
    X_train, y_train,
    epochs=50,
    batch_size=32,
    validation_data=(X_val, y_val),
    callbacks=[early_stop]
)

# Evaluate final performance
train_loss, train_acc = agent_model.evaluate(X_train, y_train, verbose=0)
val_loss, val_acc = agent_model.evaluate(X_val, y_val, verbose=0)

print(f'Training accuracy: {train_acc*100:.2f}%')
print(f'Validation accuracy: {val_acc*100:.2f}%')
Added dropout layers after dense layers to reduce overfitting by randomly turning off neurons during training.
Implemented early stopping to halt training when validation loss stops improving, preventing memorization.
Kept model size moderate with two dense layers to avoid excessive complexity.
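The dropout mechanism described above can be sketched numerically. This is an illustrative NumPy toy (the `dropout` helper is not Keras's implementation): with "inverted dropout", each unit is zeroed with probability p and the survivors are scaled by 1/(1-p), so the expected activation is unchanged.

```python
import numpy as np

# Toy inverted-dropout sketch: zero each activation with probability p,
# scale survivors by 1/(1-p) so the expected value stays the same.
def dropout(activations, p, rng):
    mask = rng.random(activations.shape) >= p  # keep with probability 1-p
    return activations * mask / (1.0 - p)

rng = np.random.default_rng(0)
a = np.ones(10_000)
dropped = dropout(a, p=0.5, rng=rng)

# Roughly half the units are zeroed, yet the mean stays near 1.0.
print((dropped == 0).mean(), dropped.mean())
```

At inference time Keras disables dropout automatically, which is why the scaling happens during training rather than at test time.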
Results Interpretation

Before: Training success rate was 98%, validation success rate was 65%. The agent memorized training tasks but failed on new ones.

After: Training success rate dropped to 92% and validation success rate improved to 87%. The agent generalized better to new tasks.

Adding dropout and early stopping helps reduce overfitting by preventing the agent from memorizing training data, leading to better performance on new tasks.
Bonus Experiment
Try using data augmentation by slightly modifying the input states during training to improve generalization further.
💡 Hint
Introduce small random changes to the input features to simulate different computer states, helping the agent learn more robust behaviors.
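One way to act on this hint is to jitter the input features with small Gaussian noise before each training step. A minimal sketch, assuming the interface state is a feature vector (the noise scale and `augment` helper are illustrative choices, not from the exercise):

```python
import numpy as np

# Sketch: perturb input states with small Gaussian noise so the agent
# sees slightly different "computer states" on every pass.
def augment(states, noise_std=0.01, rng=None):
    rng = rng or np.random.default_rng()
    return states + rng.normal(0.0, noise_std, size=states.shape)

rng = np.random.default_rng(42)
X_batch = np.zeros((32, 100))  # placeholder batch of interface-state vectors
X_aug = augment(X_batch, noise_std=0.01, rng=rng)

print(X_aug.shape)  # augmentation preserves the batch shape
```

In training, you would call `augment(X_train)` inside the data pipeline each epoch; validation data stays unmodified so the metric still reflects performance on clean, unseen tasks.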