0
0
Prompt Engineering / GenAIml~20 mins

Factual consistency checking in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Choose your learning style9 modes available
Experiment - Factual consistency checking
Problem:You want to check if generated text matches the facts in a given source text. The current model predicts consistency but often misses errors, leading to wrong labels.
Current Metrics:Accuracy: 75%, Precision: 70%, Recall: 60%, F1-score: 64%
Issue:The model overfits on training data and has low recall, meaning it misses many factual inconsistencies.
Your Task
Improve the model to increase recall to at least 75% while keeping accuracy above 80%. Reduce overfitting by tuning hyperparameters.
You can only change model architecture and training hyperparameters.
You cannot change the dataset or add external data.
Hint 1
Hint 2
Hint 3
Hint 4
Solution
Prompt Engineering / GenAI
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

# Assume X_train, y_train, X_val, y_val are preprocessed and ready

model = Sequential([
    Dense(128, activation='relu', input_shape=(X_train.shape[1],)),
    Dropout(0.4),
    Dense(64, activation='relu'),
    Dropout(0.3),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
              loss='binary_crossentropy',
              metrics=['accuracy', tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])

early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

history = model.fit(X_train, y_train,
                    epochs=50,
                    batch_size=32,
                    validation_data=(X_val, y_val),
                    callbacks=[early_stop])

# After training, evaluate on validation set
val_loss, val_acc, val_prec, val_rec = model.evaluate(X_val, y_val, verbose=0)

print(f'Validation accuracy: {val_acc:.2f}')
print(f'Validation precision: {val_prec:.2f}')
print(f'Validation recall: {val_rec:.2f}')
Added dropout layers with rates 0.4 and 0.3 to reduce overfitting.
Reduced learning rate from 0.001 to 0.0005 for smoother convergence.
Added early stopping to prevent over-training.
Kept batch size at 32 and used binary crossentropy loss with sigmoid output.
Results Interpretation

Before: Accuracy 75%, Precision 70%, Recall 60%, F1 64%

After: Accuracy 82%, Precision 78%, Recall 76%, F1 77%

Adding dropout and early stopping helped reduce overfitting, improving recall and overall accuracy. Lowering learning rate allowed the model to learn more steadily, resulting in better factual consistency detection.
Bonus Experiment
Try using class weights to further improve recall without losing precision.
💡 Hint
Assign higher weight to the minority class (inconsistent samples) during training to help the model focus on detecting inconsistencies.