Which of the following best explains why L2 regularization helps prevent overfitting in a neural network?
Think about how controlling the size of weights affects model complexity.
L2 regularization adds a penalty proportional to the squared magnitude of the weights, which discourages the model from fitting noise in the training data. Keeping weights small yields a smoother decision function and a simpler model that generalizes better to new data.
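A minimal Keras sketch of this idea (the 0.01 coefficient is an illustrative assumption, not part of the question):

```python
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

# A small binary classifier with an L2 penalty on the hidden layer's weights.
# kernel_regularizer adds 0.01 * sum(w**2) to the training loss, so large
# weights are penalized and the fit stays simpler.
model = models.Sequential([
    layers.Dense(64, activation='relu', input_shape=(20,),
                 kernel_regularizer=regularizers.l2(0.01)),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
```

The same `kernel_regularizer` argument can be passed to any Dense or Conv layer; without it, the loss contains no weight penalty at all.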
Consider the following TensorFlow code snippet that trains a simple model with and without dropout. What will be the expected difference in training accuracy after 10 epochs?
import tensorflow as tf
from tensorflow.keras import layers, models

# Model without dropout
model_no_dropout = models.Sequential([
    layers.Dense(64, activation='relu', input_shape=(20,)),
    layers.Dense(1, activation='sigmoid')
])
model_no_dropout.compile(optimizer='adam', loss='binary_crossentropy',
                         metrics=['accuracy'])

# Model with dropout
model_dropout = models.Sequential([
    layers.Dense(64, activation='relu', input_shape=(20,)),
    layers.Dropout(0.5),
    layers.Dense(1, activation='sigmoid')
])
model_dropout.compile(optimizer='adam', loss='binary_crossentropy',
                      metrics=['accuracy'])

# Assume training data X_train, y_train are available
# Both models trained for 10 epochs
# What is the expected difference in training accuracy between the two models?
Dropout randomly disables neurons during training. How does this affect training accuracy?
Dropout prevents the model from relying too heavily on any single neuron by randomly disabling neurons during training. This usually lowers training accuracy, since the network trains on a noisier, thinned-out version of itself, but it improves generalization to unseen data.
You train two neural networks on the same dataset: one with L2 regularization and one without. After training, you observe the following validation losses:
- Model with L2 regularization: 0.35
- Model without regularization: 0.60
What does this difference in validation loss indicate?
Lower validation loss usually means better performance on unseen data.
The model with L2 regularization has a lower validation loss, indicating it generalizes better and is less prone to overfitting compared to the model without regularization.
Given the following TensorFlow model code, the model still overfits the training data. What is the most likely reason?
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(30,)),
    layers.Dropout(0.2),
    layers.Dense(128, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])

# Model trained on small dataset for 50 epochs
# Overfitting observed: training accuracy much higher than validation accuracy
Consider how dropout rate affects neuron deactivation during training.
A dropout rate of 0.2 disables only 20% of neurons at each training step, which may be too little regularization for a 128-unit model trained for 50 epochs on a small dataset. Note also that only the first hidden layer has dropout at all; the second hidden layer is entirely unregularized.
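One hedged fix, keeping the same architecture: raise the rate to a more typical 0.5 and apply dropout after both hidden layers. The 0.5 value is a common starting point, not a prescription.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Same 30-feature binary classifier, but with stronger dropout (0.5)
# applied after each hidden layer instead of a single 0.2 layer.
model = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(30,)),
    layers.Dropout(0.5),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
```

On a small dataset it may also help to shrink the hidden layers or train for fewer epochs; dropout alone cannot compensate for a model that is far too large for the data.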
You are training a deep convolutional neural network on a large image dataset. The model overfits despite using L2 regularization. Which additional regularization technique is most appropriate to try next?
Think about regularization methods that randomly deactivate neurons to prevent co-adaptation.
Adding dropout layers between convolutional layers helps prevent overfitting by randomly disabling neurons during training, forcing the network to learn more robust features.
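An illustrative sketch of such a network; the input shape, filter counts, and 0.25 rates are assumptions chosen for a small CIFAR-style input, not values from the question.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# A small CNN with Dropout after each pooling stage, so each conv block
# must learn features that do not depend on any single activation.
model = models.Sequential([
    layers.Conv2D(32, 3, activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),
    layers.Conv2D(64, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),
    layers.Flatten(),
    layers.Dense(10, activation='softmax'),
])
```

For convolutional feature maps, Keras also provides SpatialDropout2D, which drops entire channels rather than individual activations and is often a better fit when adjacent pixels are strongly correlated.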