Weight initialization sets the starting values of a model's weights; good starting values help the model learn faster and more reliably.
Weight initialization strategies in TensorFlow
```python
initializer = tf.keras.initializers.HeNormal()
layer = tf.keras.layers.Dense(units=64, activation='relu',
                              kernel_initializer=initializer)
```
You choose an initializer and pass it to the layer's kernel_initializer argument.
Common initializers include GlorotUniform, HeNormal, and RandomNormal.
```python
# Glorot (Xavier) uniform
initializer = tf.keras.initializers.GlorotUniform()
layer = tf.keras.layers.Dense(32, activation='relu', kernel_initializer=initializer)

# He normal, scaled for ReLU activations
initializer = tf.keras.initializers.HeNormal()
layer = tf.keras.layers.Dense(64, activation='relu', kernel_initializer=initializer)

# Plain random normal with a chosen mean and standard deviation
initializer = tf.keras.initializers.RandomNormal(mean=0., stddev=0.05)
layer = tf.keras.layers.Dense(10, activation='softmax', kernel_initializer=initializer)
```
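As a shorthand, Keras also accepts string identifiers in place of initializer objects; a quick sketch (the layer names here are just for illustration):

```python
import tensorflow as tf

# The string 'he_normal' resolves to the same initializer class
# as constructing tf.keras.initializers.HeNormal() directly.
layer_a = tf.keras.layers.Dense(64, activation='relu',
                                kernel_initializer='he_normal')
layer_b = tf.keras.layers.Dense(64, activation='relu',
                                kernel_initializer=tf.keras.initializers.HeNormal())
print(type(layer_a.kernel_initializer).__name__)
```

The string form is convenient for quick experiments; the object form lets you pass arguments such as a seed.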
This code builds a small neural network using the HeNormal initializer for weights. It trains on random data for 3 epochs and then makes predictions on new random inputs.
```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Create a simple model with the HeNormal initializer
initializer = tf.keras.initializers.HeNormal()
model = models.Sequential([
    layers.Dense(64, activation='relu', kernel_initializer=initializer,
                 input_shape=(20,)),
    layers.Dense(10, activation='softmax', kernel_initializer=initializer)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Generate dummy data
x_train = np.random.rand(100, 20).astype('float32')
y_train = np.random.randint(0, 10, size=(100,))

# Train the model
history = model.fit(x_train, y_train, epochs=3, batch_size=10, verbose=2)

# Make predictions on new data
x_test = np.random.rand(5, 20).astype('float32')
predictions = model.predict(x_test)
print('Predictions shape:', predictions.shape)
print('First prediction:', predictions[0])
```
Choosing the right initializer helps your model train faster and avoid common problems such as vanishing or exploding gradients.
He initialization is best for ReLU activations, while Glorot (Xavier) is good for sigmoid or tanh.
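The difference between the two comes down to how each scales the weight variance to the layer's size. A minimal NumPy sketch of the scaling rules (the fan_in/fan_out values are arbitrary examples):

```python
import numpy as np

fan_in, fan_out = 256, 128

# He normal draws from N(0, sqrt(2 / fan_in)), sized for ReLU,
# which zeroes out roughly half its inputs.
he_std = np.sqrt(2.0 / fan_in)

# Glorot uniform draws from U(-limit, limit) with
# limit = sqrt(6 / (fan_in + fan_out)), balancing forward and
# backward signal for symmetric activations like tanh.
glorot_limit = np.sqrt(6.0 / (fan_in + fan_out))

rng = np.random.default_rng(0)
w_he = rng.normal(0.0, he_std, size=(fan_in, fan_out))
w_glorot = rng.uniform(-glorot_limit, glorot_limit, size=(fan_in, fan_out))
print('He stddev target:', round(he_std, 4))
print('Glorot limit:', round(glorot_limit, 4))
```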
Always set the initializer when creating layers to control how weights start.
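If you omit the argument, you still get an initializer, just an implicit one: Dense layers fall back to GlorotUniform by default, as this quick check sketches:

```python
import tensorflow as tf

# With no kernel_initializer argument, Dense uses GlorotUniform,
# so setting it explicitly is how you opt into anything else.
default_layer = tf.keras.layers.Dense(8)
print(type(default_layer.kernel_initializer).__name__)
```

This default is fine for sigmoid/tanh layers, but for ReLU stacks you typically want to override it with HeNormal.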
Weight initialization sets starting values for model weights to help learning.
Use HeNormal for ReLU and GlorotUniform for sigmoid/tanh activations.
Proper initialization prevents slow or stuck training and improves accuracy.