Factual consistency checking in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Experiment - Factual consistency checking

Problem: You want to check if generated text matches the facts in a given source text. The current model predicts consistency but often misses errors, leading to wrong labels.

Current Metrics: Accuracy: 75%, Precision: 70%, Recall: 60%, F1-score: 64%

Issue: The model overfits on training data and has low recall, meaning it misses many factual inconsistencies.

Your Task

Improve the model so that recall reaches at least 75% and accuracy rises above 80% (it is currently 75%). Reduce overfitting by tuning hyperparameters.

You can only change model architecture and training hyperparameters.

You cannot change the dataset or add external data.

Solution

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

# Assume X_train, y_train, X_val, y_val are preprocessed and ready

model = Sequential([
    Dense(128, activation='relu', input_shape=(X_train.shape[1],)),
    Dropout(0.4),
    Dense(64, activation='relu'),
    Dropout(0.3),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
              loss='binary_crossentropy',
              metrics=['accuracy', tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])

early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

history = model.fit(X_train, y_train,
                    epochs=50,
                    batch_size=32,
                    validation_data=(X_val, y_val),
                    callbacks=[early_stop])

# After training, evaluate on validation set
val_loss, val_acc, val_prec, val_rec = model.evaluate(X_val, y_val, verbose=0)

print(f'Validation accuracy: {val_acc:.2f}')
print(f'Validation precision: {val_prec:.2f}')
print(f'Validation recall: {val_rec:.2f}')

Added dropout layers with rates 0.4 and 0.3 to reduce overfitting.

Reduced learning rate from 0.001 to 0.0005 for smoother convergence.

Added early stopping to prevent over-training.

Kept batch size at 32 and used binary crossentropy loss with sigmoid output.
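One way to check whether these changes actually reduced overfitting is to compare training and validation loss at the best epoch. The sketch below assumes a dictionary shaped like Keras's `history.history`, with `loss` and `val_loss` keys. The helper name `overfitting_gap` and the example numbers are hypothetical and are not results from this experiment.

```python
def overfitting_gap(history_dict):
    """Return (best_epoch, gap) at the epoch with the lowest validation loss.

    gap = val_loss - train_loss; a large positive gap suggests overfitting.
    """
    train_losses = history_dict["loss"]
    val_losses = history_dict["val_loss"]
    # Index of the epoch with the smallest validation loss (0-based)
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    gap = val_losses[best_epoch] - train_losses[best_epoch]
    return best_epoch, gap


# Made-up loss curves for illustration only
example_history = {
    "loss": [0.69, 0.55, 0.45, 0.38, 0.30],
    "val_loss": [0.68, 0.58, 0.52, 0.53, 0.57],
}

best_epoch, gap = overfitting_gap(example_history)
print(f"best epoch (0-based): {best_epoch}, val/train loss gap: {gap:.2f}")
```

With a real run, `overfitting_gap(history.history)` could be called after `model.fit`. Comparing the gap before and after adding dropout gives a rough sense of whether the regularization helped.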

Results Interpretation

Before: Accuracy 75%, Precision 70%, Recall 60%, F1 64%

After: Accuracy 82%, Precision 78%, Recall 76%, F1 77%

Adding dropout and early stopping helped reduce overfitting, improving recall and overall accuracy. Lowering the learning rate allowed the model to learn more steadily, resulting in better factual consistency detection.
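The F1 scores reported above can be reproduced from precision and recall, since F1 is their harmonic mean. This is a minimal sketch. The function name `f1_score` is chosen here for illustration and is not part of the solution code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


before = f1_score(0.70, 0.60)
after = f1_score(0.78, 0.76)

print(f"F1 before: {before:.3f}")  # 0.646
print(f"F1 after:  {after:.3f}")   # 0.770
```

The "before" value comes out at about 64.6%, which the page reports as 64%. The "after" value of about 77.0% matches the reported 77%.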

Bonus Experiment

Try using class weights to further improve recall without losing precision.

💡 Hint

Assign higher weight to the minority class (inconsistent samples) during training to help the model focus on detecting inconsistencies.
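The hint above could be sketched as follows, using the common "balanced" heuristic of weighting each class by n_samples / (n_classes * class_count). The helper `balanced_class_weights`, the 80/20 label split, and the convention that label 1 means "inconsistent" are assumptions for illustration. The real class distribution of the dataset is not given here.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Map each class to n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 80 consistent (0) and 20 inconsistent (1) samples
y_example = [0] * 80 + [1] * 20
class_weights = balanced_class_weights(y_example)
print(class_weights)  # {0: 0.625, 1: 2.5}

# With the model from the solution, the weights would be passed to fit, e.g.:
# weights = balanced_class_weights([int(v) for v in y_train])
# model.fit(X_train, y_train, epochs=50, batch_size=32,
#           validation_data=(X_val, y_val), callbacks=[early_stop],
#           class_weight=weights)
```

Weighting the minority class more heavily usually raises recall at some cost to precision, so both metrics are worth checking after retraining.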

Practice

1. What is the main purpose of factual consistency checking in AI-generated text?

easy

A. To reduce the size of the AI model

B. To improve the speed of AI text generation

C. To make AI text more creative and imaginative

D. To ensure the AI's output matches true and reliable information

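Solution

Step 1: Understand the goal of factual consistency checking. The aim is to verify that generated text agrees with the facts it is supposed to reflect.

Step 2: Compare options with this goal. Options A, B, and C concern model size, generation speed, and creativity. Only option D concerns whether the output matches reliable information.

Final Answer: D. To ensure the AI's output matches true and reliable information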
