Recall & Review
beginner
What is the purpose of weight initialization in neural networks?
Weight initialization sets the starting values of the model's weights before training. Good initialization helps the model learn faster and avoid problems like vanishing or exploding gradients.
intermediate
Explain the difference between Xavier (Glorot) and He initialization.
Xavier (Glorot) initialization scales weights so the variance of activations stays roughly constant across layers, which works well with sigmoid and tanh activations. He initialization uses a larger variance (2/fan_in) tailored to ReLU, compensating for ReLU zeroing out roughly half its inputs and helping avoid dying neurons.
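The difference between the two schemes comes down to the standard deviation of the weight distribution. A minimal NumPy sketch, assuming the normal variants (Glorot normal uses std = sqrt(2 / (fan_in + fan_out)), He normal uses std = sqrt(2 / fan_in)):

```python
import numpy as np

def xavier_normal(fan_in, fan_out, rng):
    # Glorot/Xavier normal: std = sqrt(2 / (fan_in + fan_out))
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_in, fan_out))

def he_normal(fan_in, fan_out, rng):
    # He normal: std = sqrt(2 / fan_in), tuned for ReLU layers
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))

rng = np.random.default_rng(0)
W_xavier = xavier_normal(512, 512, rng)
W_he = he_normal(512, 512, rng)
# When fan_in == fan_out, He weights have sqrt(2) times the std of Xavier
print(W_xavier.std(), W_he.std())
```

With equal fan-in and fan-out, the He standard deviation is exactly sqrt(2) times the Xavier one, reflecting the factor of 2 that compensates for ReLU discarding negative activations.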
beginner
What problem does He initialization help to solve?
He initialization helps prevent the problem of dying ReLU neurons by scaling weights properly so that the variance of outputs stays stable through layers.
beginner
How does random initialization differ from zero initialization, and why is zero initialization usually bad?
Random initialization assigns small random values to weights, breaking symmetry so neurons learn different features. Zero initialization sets all weights to zero, causing all neurons to learn the same thing and preventing effective training.
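The symmetry problem can be shown in a few lines of NumPy. This is a toy sketch (a made-up 3-2-1 network with tanh and squared-error loss, weights all set to the same constant, of which zero initialization is the extreme case): both hidden neurons produce the same activation and therefore receive identical gradient rows, so they can never learn different features.

```python
import numpy as np

# Tiny net: 3 inputs -> 2 hidden (tanh) -> 1 output, squared-error loss.
# Every weight starts at the same constant, as in zero initialization.
x = np.array([1.0, -2.0, 0.5])
W1 = np.full((2, 3), 0.1)   # both hidden neurons have identical weights
w2 = np.full(2, 0.1)
y_true = 1.0

h = np.tanh(W1 @ x)         # h[0] == h[1]: identical activations
y = w2 @ h
dy = 2.0 * (y - y_true)
# Backprop through the hidden layer: row i is dy * w2[i] * (1 - h[i]^2) * x
dW1 = np.outer(dy * w2 * (1.0 - h**2), x)

print(np.allclose(dW1[0], dW1[1]))  # True: the gradient rows are identical,
                                    # so the neurons stay clones forever
```

Random initialization breaks this tie: with distinct starting weights, the two rows of `dW1` differ and the neurons diverge into different features.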
beginner
Name two common TensorFlow functions used for weight initialization.
Two common TensorFlow initializers are tf.keras.initializers.GlorotUniform() for Xavier initialization and tf.keras.initializers.HeNormal() for He initialization.
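In Keras, initializers are passed per layer via the `kernel_initializer` argument. A minimal sketch (the layer sizes and activations here are illustrative, not from the card):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    # Xavier/Glorot initialization pairs naturally with tanh
    tf.keras.layers.Dense(
        64, activation="tanh",
        kernel_initializer=tf.keras.initializers.GlorotUniform()),
    # He initialization pairs naturally with ReLU
    tf.keras.layers.Dense(
        64, activation="relu",
        kernel_initializer=tf.keras.initializers.HeNormal()),
    tf.keras.layers.Dense(1),
])
print(model.output_shape)
```

String shortcuts (`kernel_initializer="glorot_uniform"` or `"he_normal"`) are equivalent; `"glorot_uniform"` is also the Keras default for `Dense` layers.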
Which weight initialization method is best suited for ReLU activation functions?
He initialization is designed to work well with ReLU activations by scaling weights to keep variance stable.
Why is zero initialization of weights usually a bad idea?
Zero initialization causes symmetry where all neurons update identically, preventing the network from learning diverse features.
What does Xavier initialization aim to keep stable across layers?
Xavier initialization keeps the variance of activations stable to help gradients flow well during training.
Which TensorFlow initializer corresponds to Xavier initialization?
GlorotUniform is TensorFlow's implementation of Xavier initialization.
What problem can occur if weights are initialized with very large values?
Large initial weights can cause exploding gradients, making training unstable.
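The blow-up is easy to see in a toy deep network. A hedged NumPy sketch (made-up setup: one input vector pushed through 10 ReLU layers of width 256, comparing an oversized std of 1.0 against He scaling):

```python
import numpy as np

rng = np.random.default_rng(0)
fan = 256
x0 = rng.normal(size=fan)

def forward(x, std, layers=10):
    # Push the signal through `layers` ReLU layers with weights ~ N(0, std^2)
    for _ in range(layers):
        W = rng.normal(0.0, std, size=(fan, fan))
        x = np.maximum(0.0, W @ x)
    return x

big = forward(x0.copy(), std=1.0)                # far too large
he = forward(x0.copy(), std=np.sqrt(2.0 / fan))  # He scaling

# With std = 1.0 each layer multiplies the signal's variance by ~fan/2,
# so activations (and hence gradients) explode; He scaling keeps them O(1)
print(np.abs(big).mean(), np.abs(he).mean())
```

The same compounding works in reverse: weights initialized too small shrink the signal layer by layer, producing vanishing gradients instead.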
Describe why weight initialization matters and name two common strategies.
Think about how starting weights affect learning speed and stability.
Explain how He initialization helps with ReLU activations and what problem it prevents.
Consider the behavior of ReLU and how weights influence neuron output.