PyTorch · ~20 mins

Batch normalization (nn.BatchNorm) in PyTorch - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of BatchNorm1d on a simple tensor
What is the output tensor after applying nn.BatchNorm1d to the input tensor below during training mode?
PyTorch
import torch
import torch.nn as nn

batch_norm = nn.BatchNorm1d(num_features=3)
input_tensor = torch.tensor([[1.0, 2.0, 3.0],
                             [4.0, 5.0, 6.0]], dtype=torch.float32)
output = batch_norm(input_tensor)
print(output)
A. [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
B. [[-1.2247, -1.2247, -1.2247], [1.2247, 1.2247, 1.2247]]
C. [[-1.0, -1.0, -1.0], [1.0, 1.0, 1.0]]
D. [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
💡 Hint
BatchNorm normalizes each feature across the batch to have mean 0 and variance 1 during training.
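If you want to check your answer after submitting, this sketch reproduces BatchNorm1d's training-mode computation by hand on the same tensor (at initialization, gamma=1 and beta=0, so only the normalization matters):

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(num_features=3)
x = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

# In training mode, BatchNorm normalizes each feature (column) using
# the *biased* batch variance: (x - mean) / sqrt(var + eps).
mean = x.mean(dim=0)
var = x.var(dim=0, unbiased=False)
manual = (x - mean) / torch.sqrt(var + bn.eps)

out = bn(x)
print(manual)
print(out)  # matches the manual computation
```

Note that the per-feature standard deviation here is sqrt(2.25) = 1.5, computed with the biased (divide-by-N) variance, not the unbiased divide-by-(N-1) variant.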
🧠 Conceptual
intermediate
Purpose of running mean and variance in BatchNorm
What is the main purpose of the running mean and running variance in nn.BatchNorm layers during training and evaluation?
A. They are used to initialize the weights of the BatchNorm layer.
B. They are only used to compute gradients during backpropagation.
C. They keep a moving average of mean and variance to use during evaluation mode.
D. They store the mean and variance of the current batch only.
💡 Hint
Think about how BatchNorm behaves differently during training and evaluation.
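To explore this behavior yourself, the following sketch inspects the `running_mean`/`running_var` buffers before and after a training-mode forward pass, and then switches to eval mode:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(2)
print(bn.running_mean)  # initialized to zeros
print(bn.running_var)   # initialized to ones

x = torch.randn(8, 2) * 3 + 5  # batch with mean ~5, std ~3

bn.train()
_ = bn(x)  # a training-mode forward pass updates the running statistics
print(bn.running_mean)  # has moved toward the batch mean

bn.eval()
y = bn(x)  # eval mode normalizes with running_mean/running_var, not batch stats
```

With the default momentum of 0.1, one update moves each buffer 10% of the way toward the current batch statistic.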
Hyperparameter
advanced
Effect of momentum parameter in nn.BatchNorm
In nn.BatchNorm, what effect does increasing the momentum parameter have on the running mean and variance?
A. It makes the running statistics update faster, relying more on the current batch.
B. It makes the running statistics update more slowly, relying more on past values.
C. It disables the running statistics updates entirely.
D. It changes the learning rate of the BatchNorm layer.
💡 Hint
Momentum controls how much weight is given to new batch statistics versus old running statistics.
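One point worth remembering: PyTorch's `momentum` argument is the opposite of the classical optimizer-momentum convention. This sketch compares a small and a large momentum after a single training-mode batch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(16, 4) + 10.0  # batch mean around 10

# PyTorch's update rule is:
#   running = (1 - momentum) * running + momentum * batch_stat
# so a LARGER momentum weights the current batch more heavily.
slow = nn.BatchNorm1d(4, momentum=0.01)
fast = nn.BatchNorm1d(4, momentum=0.9)

slow.train()
fast.train()
_ = slow(x)
_ = fast(x)

print(slow.running_mean)  # barely moved from its initial zeros
print(fast.running_mean)  # close to the batch mean (~10)
```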
🔧 Debug
advanced
Identifying error in BatchNorm usage
What happens when the following code snippet runs, and why?
PyTorch
import torch
import torch.nn as nn

batch_norm = nn.BatchNorm2d(num_features=3)
input_tensor = torch.randn(10, 5, 32, 32)
output = batch_norm(input_tensor)
A. ValueError: num_features must be equal to batch size.
B. TypeError: nn.BatchNorm2d does not accept 4D tensors.
C. No error; the output tensor shape is (10, 3, 32, 32).
D. RuntimeError: Expected 3 channels but got 5 channels in input tensor.
💡 Hint
Check the channel dimension of the input tensor and the num_features parameter.
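To see the shape contract concretely, this sketch runs BatchNorm2d on a matching input and then on the mismatched one from the question, catching any exception rather than crashing:

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(num_features=3)  # expects 3 channels in dim 1

ok = torch.randn(10, 3, 32, 32)      # (N, C, H, W) with C == num_features
print(bn(ok).shape)                  # torch.Size([10, 3, 32, 32])

bad = torch.randn(10, 5, 32, 32)     # C=5 does not match num_features=3
err = None
try:
    bn(bad)
except RuntimeError as e:
    err = e
print(type(err).__name__, err)
```

The key check is the channel dimension (dim 1), not the batch size: `num_features` must equal C.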
Model Choice
expert
Choosing BatchNorm variant for sequence data
You are building a neural network to process variable-length sequences of word embeddings with shape (batch_size, sequence_length, embedding_dim). Which BatchNorm variant is most appropriate to normalize the embeddings across the batch and sequence length?
A. nn.BatchNorm1d applied on the embedding_dim dimension after reshaping to (batch_size*sequence_length, embedding_dim).
B. nn.BatchNorm2d applied directly on the (batch_size, sequence_length, embedding_dim) tensor.
C. nn.BatchNorm3d applied on the input tensor.
D. No BatchNorm variant is suitable; use LayerNorm instead.
💡 Hint
Consider how BatchNorm1d expects input shape and what dimension to normalize.
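After answering, this sketch (with made-up dimensions) shows two equivalent ways to normalize the embedding dimension over batch and sequence positions: the reshape from option A, and the fact that BatchNorm1d also accepts (N, C, L) input directly:

```python
import torch
import torch.nn as nn

batch_size, seq_len, emb_dim = 4, 7, 16
x = torch.randn(batch_size, seq_len, emb_dim)

bn = nn.BatchNorm1d(emb_dim)

# Reshape so each embedding dimension is a feature column, normalized
# over batch_size * seq_len samples.
flat = x.reshape(batch_size * seq_len, emb_dim)
out = bn(flat).reshape(batch_size, seq_len, emb_dim)
print(out.shape)  # torch.Size([4, 7, 16])

# Equivalent alternative: BatchNorm1d treats (N, C, L) input by
# normalizing over N and L, so transposing to (batch, emb_dim, seq_len)
# produces the same statistics.
out2 = bn(x.transpose(1, 2)).transpose(1, 2)
print(out2.shape)
```

Caveat worth keeping in mind: with variable-length (padded) sequences, padding positions pollute the batch statistics, which is one reason LayerNorm is common for sequence models in practice.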