
Batch normalization in TensorFlow

Introduction

Batch normalization helps models learn faster and better by keeping data balanced inside the network.

Use batch normalization in these situations:

When training deep neural networks and you want learning to be more stable.
When you want to reduce the time a model takes to learn.
When training is slow or accuracy is not improving.
When you want to reduce the chance of the model getting stuck during training.
Syntax
TensorFlow
tf.keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True)

axis: The axis to normalize (usually the features axis; -1, the last axis, by default).

momentum: Controls the moving average of the mean and variance.

epsilon: A small constant added to the variance to avoid division by zero.

center: If True, adds a learnable shift (beta) to the normalized output.

scale: If True, multiplies the normalized output by a learnable scale (gamma).
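The momentum parameter sets how slowly the moving statistics track each batch. A minimal sketch of the Keras update rule, using made-up batch means for illustration:

```python
# Keras updates the moving statistics at every training step as:
# moving_mean = moving_mean * momentum + batch_mean * (1 - momentum)
momentum = 0.99
moving_mean = 0.0

# Hypothetical batch means observed over three training steps
for batch_mean in [0.5, 0.6, 0.55]:
    moving_mean = moving_mean * momentum + batch_mean * (1 - momentum)

print(moving_mean)
```

A momentum close to 1 means the moving average changes slowly and smooths out batch-to-batch noise; a lower momentum makes it react faster but more noisily.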

Examples
Default batch normalization layer normalizing over the last axis.
TensorFlow
tf.keras.layers.BatchNormalization()
Normalize over axis 1, useful for channels-first data layouts such as (batch, channels, length).
TensorFlow
tf.keras.layers.BatchNormalization(axis=1)
Custom momentum and epsilon for more control over normalization.
TensorFlow
tf.keras.layers.BatchNormalization(momentum=0.9, epsilon=1e-5)
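As a sanity check (not part of the examples above), the layer's output in training mode can be reproduced by hand with the formula (x - mean) / sqrt(var + epsilon). The learnable gamma starts at 1 and beta at 0, so they drop out here. A small sketch assuming eager execution:

```python
import numpy as np
import tensorflow as tf

bn = tf.keras.layers.BatchNormalization(epsilon=1e-5)
x = np.random.rand(32, 4).astype('float32')

# In training mode the layer normalizes with the current batch's mean/variance
out = bn(x, training=True).numpy()

# Manual computation: (x - mean) / sqrt(var + epsilon); gamma=1, beta=0 initially
mean = x.mean(axis=0)
var = x.var(axis=0)
manual = (x - mean) / np.sqrt(var + 1e-5)

print(np.allclose(out, manual, atol=1e-4))
```

In inference mode (training=False) the layer instead uses the moving mean and variance accumulated during training, so the two results would no longer match.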
Sample Model

This example builds a small neural network with batch normalization after the first dense layer. It trains on random data and shows predictions for 5 samples.

TensorFlow
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Create a simple model with batch normalization after the first dense layer
model = models.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(64),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Generate random training data: label is 1 when the feature sum exceeds 10
x_train = np.random.rand(1000, 20).astype('float32')
y_train = (np.sum(x_train, axis=1) > 10).astype('float32')

# Train the model
history = model.fit(x_train, y_train, epochs=3, batch_size=32, verbose=2)

# Make predictions for the first 5 samples
predictions = model.predict(x_train[:5])
print('Predictions:', predictions.flatten())
Important Notes

Batch normalization relies on batch statistics, so very small batch sizes give noisy estimates of the mean and variance and may reduce its effectiveness.

It adds two trainable parameters per feature: one to shift (beta) and one to scale (gamma) the normalized data, plus two non-trainable moving statistics (mean and variance) used at inference time.
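To see those parameters, one can build a layer on a known feature size and inspect its weights. A small sketch with 8 features:

```python
import tensorflow as tf

bn = tf.keras.layers.BatchNormalization()
bn.build((None, 8))  # build for inputs with 8 features

# Trainable: gamma (scale) and beta (shift), one value per feature
print(len(bn.trainable_weights))      # 2 weight tensors
# Non-trainable: moving mean and moving variance
print(len(bn.non_trainable_weights))  # 2 weight tensors
# 4 tensors of shape (8,) -> 32 parameters in total
print(bn.count_params())
```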

Summary

Batch normalization helps speed up and stabilize training by normalizing layer inputs.

It is usually added as a layer after a dense or convolutional layer and before the activation.

It has parameters like momentum and epsilon to control its behavior.