TensorFlow · ~20 mins

Weight initialization strategies in TensorFlow - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Why use He initialization in deep neural networks?

He initialization is often recommended for deep networks with ReLU activations. Why is this the case?

A. It randomly sets some weights to zero to create sparsity.
B. It sets all weights to zero to simplify training.
C. It initializes weights with very large values to speed up convergence.
D. It helps keep the variance of activations constant across layers, preventing vanishing or exploding gradients.
💡 Hint

Think about how activation variance affects gradient flow in deep networks.
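The hint can be made concrete with a small NumPy sketch (NumPy is used here only to illustrate the formula; TensorFlow's tf.keras.initializers.HeNormal implements the same rule). He initialization draws weights with stddev sqrt(2 / fan_in), which keeps the second moment of ReLU activations roughly constant from one layer to the next:

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out, batch = 512, 512, 1000

x = rng.normal(size=(batch, fan_in))
# He initialization: stddev = sqrt(2 / fan_in)
W = rng.normal(scale=np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
a = np.maximum(x @ W, 0.0)  # ReLU

# Mean squared activation stays close to that of the input,
# so signals neither vanish nor explode as depth grows.
print(np.mean(x**2), np.mean(a**2))
```

Repeating the matrix product with a stddev much smaller or larger than sqrt(2 / fan_in) makes the second line shrink or blow up, which is exactly the vanishing/exploding-gradient problem the question describes.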

Predict Output
intermediate
Output shape after Xavier initialization in TensorFlow

What is the shape of the weights tensor initialized by the following code?

TensorFlow
import tensorflow as tf
initializer = tf.keras.initializers.GlorotUniform()
weights = initializer(shape=(64, 128))
print(weights.shape)
A. (64,)
B. (128, 64)
C. (64, 128)
D. (128,)
💡 Hint

Look at the shape argument passed to the initializer.
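As the hint suggests, a Keras initializer returns a tensor of exactly the shape it is asked for; it does not transpose or reinterpret the dimensions. A quick sketch with a different shape (assuming TensorFlow 2.x):

```python
import tensorflow as tf

# The returned tensor has exactly the requested shape.
init = tf.keras.initializers.GlorotUniform(seed=0)
w = init(shape=(3, 5))
print(w.shape)  # (3, 5)
```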

Hyperparameter
advanced
Choosing initialization for sigmoid activation

Which weight initialization strategy is best suited for a network using sigmoid activation functions to reduce vanishing gradients?

A. He initialization
B. Xavier (Glorot) initialization
C. Random normal with mean 0 and stddev 1
D. All zeros initialization
💡 Hint

Consider the activation function's output range and how initialization affects gradient flow.
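For intuition, the Glorot/Xavier uniform rule draws weights from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)), balancing forward and backward signal variance, which suits saturating activations such as sigmoid. A NumPy sketch of the formula (TensorFlow's GlorotUniform implements the same rule):

```python
import numpy as np

rng = np.random.default_rng(1)
fan_in, fan_out = 256, 256

# Glorot/Xavier uniform: limit = sqrt(6 / (fan_in + fan_out))
limit = np.sqrt(6.0 / (fan_in + fan_out))
W = rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Variance of uniform(-limit, limit) is limit**2 / 3 = 2 / (fan_in + fan_out),
# so pre-activation variance matches input variance when fan_in == fan_out,
# keeping sigmoid inputs in the non-saturated region.
x = rng.normal(size=(1000, fan_in))
print(np.var(x), np.var(x @ W))
```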

Metrics
advanced
Effect of poor weight initialization on training metrics

What is the most likely effect on training loss and accuracy if weights are initialized with very large random values?

A. Training loss will be unstable and accuracy will be low due to exploding gradients.
B. Training loss will decrease smoothly and accuracy will improve quickly.
C. Training loss will be zero from the start and accuracy will be perfect.
D. Training loss and accuracy will not change because initialization does not affect training.
💡 Hint

Think about how large initial weights affect gradient calculations.
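The effect in the hint is easy to reproduce: stacking a few ReLU layers whose weights have stddev 1 (instead of the He value sqrt(2 / fan_in)) makes activation magnitudes grow by orders of magnitude per layer, which in turn produces huge, unstable gradients during backpropagation. A NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
fan = 256
x = rng.normal(size=(100, fan))

rms = [np.sqrt(np.mean(x**2))]  # root-mean-square activation per layer
for _ in range(3):
    # Poor choice: stddev 1.0 instead of sqrt(2 / fan)
    W = rng.normal(scale=1.0, size=(fan, fan))
    x = np.maximum(x @ W, 0.0)  # ReLU
    rms.append(np.sqrt(np.mean(x**2)))

# RMS activation grows by roughly sqrt(fan / 2) per layer
print(rms)
```

After only three layers the activations are thousands of times larger than the input, so the loss oscillates or overflows instead of decreasing smoothly.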

🔧 Debug
expert
Identify the error in this custom weight initializer

What error will this TensorFlow custom initializer code raise?

TensorFlow
import tensorflow as tf
class CustomInit(tf.keras.initializers.Initializer):
    def __call__(self, shape, dtype=None):
        return tf.random.uniform(shape, minval=-1, maxval=1, dtype=dtype)

initializer = CustomInit()
weights = initializer(shape=(32, 32))
A. No error, code runs correctly
B. AttributeError because __call__ method is missing
C. ValueError because shape is not a tuple
D. TypeError because dtype is not passed to tf.random.uniform
💡 Hint

tf.random.uniform defaults to dtype=tf.float32, but a default only applies when the argument is omitted. Here __call__ forwards dtype=None explicitly, and tf.random.uniform cannot convert None to a TensorFlow DType.
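For reference, a defensive variant of the initializer that falls back to float32 when no dtype is supplied, with a get_config method so models using it can be serialized (a sketch, assuming TensorFlow 2.x):

```python
import tensorflow as tf

class CustomInit(tf.keras.initializers.Initializer):
    def __call__(self, shape, dtype=None):
        # Fall back to float32 so a direct call without dtype also works
        dtype = dtype or tf.float32
        return tf.random.uniform(shape, minval=-1.0, maxval=1.0, dtype=dtype)

    def get_config(self):
        # No constructor arguments to save; enables model serialization
        return {}

# Use it as the kernel initializer of a Dense layer
layer = tf.keras.layers.Dense(4, kernel_initializer=CustomInit())
layer.build((None, 8))
print(layer.kernel.shape)  # (8, 4)
```

Keras passes a concrete dtype when it builds layer weights, so the fallback only matters when the initializer is called standalone, as in the question's snippet.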