TensorFlow · ~20 mins

K-fold cross-validation in TensorFlow - ML Experiment: Train & Evaluate

Experiment - K-fold cross-validation
Problem: You want to evaluate how well your neural network model will perform on new data. Currently, you train the model once and test it once, which might give a biased result.
Current Metrics: Training accuracy: 92%, Validation accuracy: 88%
Issue: A single train-test split might not represent the model's true performance; the validation accuracy could change noticeably if you split the data differently.
Your Task
Use K-fold cross-validation to get a more reliable estimate of model performance by training and validating the model on different data splits.
Use 5 folds for cross-validation.
Keep the same model architecture and training parameters.
Use TensorFlow and Keras only.
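Before looking at the solution, it helps to see what K-fold splitting actually does. The sketch below (names like `kfold_indices` are illustrative, not part of any library) builds the five train/validation index pairs by hand with NumPy only: shuffle the indices once, cut them into five folds, and let each fold take one turn as the validation set.

```python
import numpy as np

def kfold_indices(n_samples, n_splits=5, seed=42):
    """Yield (train_idx, val_idx) pairs, mimicking a shuffled K-fold split."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)          # shuffle once up front
    folds = np.array_split(indices, n_splits)     # cut into n_splits folds
    for i in range(n_splits):
        val_idx = folds[i]                        # fold i is held out
        train_idx = np.concatenate(
            [folds[j] for j in range(n_splits) if j != i]
        )                                         # all other folds train
        yield train_idx, val_idx

# Each of the 1000 samples lands in exactly one validation fold.
for train_idx, val_idx in kfold_indices(1000, n_splits=5):
    print(len(train_idx), len(val_idx))  # 800 200, five times
```

This is the same contract `sklearn.model_selection.KFold(shuffle=True)` provides in the solution below: every sample is validated exactly once, and trained on in the other four folds.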
Solution
TensorFlow
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import KFold

# Generate dummy data
X = np.random.rand(1000, 20)
y = (np.sum(X, axis=1) > 10).astype(int)  # Binary target; 10 is the expected row sum of 20 uniforms, so classes are roughly balanced

# Define model architecture function
def create_model():
    model = Sequential([
        Dense(32, activation='relu', input_shape=(20,)),
        Dense(16, activation='relu'),
        Dense(1, activation='sigmoid')
    ])
    model.compile(optimizer=Adam(learning_rate=0.001),
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

kf = KFold(n_splits=5, shuffle=True, random_state=42)
val_accuracies = []

for train_index, val_index in kf.split(X):
    X_train, X_val = X[train_index], X[val_index]
    y_train, y_val = y[train_index], y[val_index]

    model = create_model()
    model.fit(X_train, y_train, epochs=10, batch_size=32, verbose=0)
    loss, accuracy = model.evaluate(X_val, y_val, verbose=0)
    val_accuracies.append(accuracy)

average_val_accuracy = np.mean(val_accuracies)
print(f'Average validation accuracy over 5 folds: {average_val_accuracy:.4f}')
Implemented 5-fold cross-validation using sklearn's KFold.
Trained and evaluated the model on each fold separately.
Calculated average validation accuracy across all folds.
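The solution reports only the mean, but the fold-to-fold spread is just as informative: a small standard deviation means the estimate is stable across splits. A minimal sketch, using hypothetical per-fold accuracies (illustrative values, not results from the model above):

```python
import numpy as np

# Hypothetical per-fold validation accuracies (illustrative values only)
val_accuracies = [0.88, 0.90, 0.89, 0.87, 0.91]

mean_acc = np.mean(val_accuracies)
std_acc = np.std(val_accuracies)   # population std over the 5 folds
print(f'Validation accuracy: {mean_acc:.4f} ± {std_acc:.4f}')
```

Reporting "mean ± std" instead of a single number makes it obvious when one lucky split is inflating the score.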
Results Interpretation

Before K-fold: Validation accuracy = 88%

After K-fold: Average validation accuracy = 89%

K-fold cross-validation gives a more reliable and stable estimate of model performance by testing on multiple data splits instead of just one.
Bonus Experiment
Try increasing the number of folds to 10 and observe how the average validation accuracy and training time change.
💡 Hint
More folds give a better estimate but increase training time because the model trains more times.
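A back-of-envelope estimate of that cost, assuming training time scales with the number of samples processed: with k folds you train k models, each on (k-1)/k of the data, so total work per epoch is about (k-1) passes over the full dataset.

```python
def relative_training_cost(n_splits):
    """Training samples processed per epoch, relative to one pass over
    the full dataset: n_splits models each see (n_splits-1)/n_splits of it."""
    return n_splits * (n_splits - 1) / n_splits  # simplifies to n_splits - 1

for k in (5, 10):
    print(f'{k} folds: ~{relative_training_cost(k):.0f}x one full-dataset epoch')

ratio = relative_training_cost(10) / relative_training_cost(5)
print(f'10-fold vs 5-fold cost ratio: ~{ratio:.2f}x')
```

So doubling the folds from 5 to 10 roughly doubles the training time (about 2.25x by this estimate), in exchange for validation sets that are half as large but sampled twice as often.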