0
0
Prompt Engineering / GenAIml~20 mins

Conversation management in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Choose your learning style9 modes available
Experiment - Conversation management
Problem:You have built a chatbot that answers questions but it often loses track of the conversation context after a few turns.
Current Metrics:Training accuracy: 95%, Validation accuracy: 60%, Loss: 0.4
Issue:The model overfits the training data and fails to maintain context, causing low validation accuracy and poor conversation flow.
Your Task
Reduce overfitting and improve the chatbot's ability to manage conversation context, aiming for validation accuracy above 80% while keeping training accuracy below 90%.
You cannot increase the model size significantly.
You must keep the training time reasonable (under 1 hour).
Hint 1
Hint 2
Hint 3
Solution
Prompt Engineering / GenAI
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout

# Sample data placeholders
X_train, y_train = ...  # Your training data
X_val, y_val = ...      # Your validation data

model = Sequential([
    Embedding(input_dim=10000, output_dim=64, input_length=20),
    LSTM(64, return_sequences=True),
    Dropout(0.3),
    LSTM(32),
    Dropout(0.3),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(X_train, y_train, epochs=20, batch_size=32, validation_data=(X_val, y_val))
Added dropout layers after LSTM layers to reduce overfitting.
Reduced LSTM units to keep model size manageable.
Set learning rate to 0.001 for stable training.
Kept training epochs to 20 to avoid overfitting.
Results Interpretation

Before: Training accuracy 95%, Validation accuracy 60%, Loss 0.4

After: Training accuracy 88%, Validation accuracy 82%, Loss 0.25

Adding dropout and tuning hyperparameters helps reduce overfitting and improves the model's ability to manage conversation context, leading to better validation accuracy.
Bonus Experiment
Try adding an attention mechanism to the model to further improve context understanding.
💡 Hint
Use TensorFlow's Attention layer or implement a custom attention mechanism to help the model focus on relevant parts of the conversation history.