PyTorch · ML · ~20 mins

Hidden state management in PyTorch - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
What is the shape of the hidden state after one forward pass?

Consider the following PyTorch code snippet using an LSTM layer. What will be the shape of the hidden state h_n after running the forward pass?

PyTorch
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)
inputs = torch.randn(5, 3, 10)  # seq_len=5, batch=3, input_size=10
output, (h_n, c_n) = lstm(inputs)

# What is h_n.shape?
A) (3, 20)
B) (5, 3, 20)
C) (2, 3, 20)
D) (3, 2, 20)
💡 Hint

Remember the hidden state shape is (num_layers, batch_size, hidden_size).
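The hint can be verified directly by running the snippet and printing the shapes (a quick sketch mirroring the layer sizes above):

```python
import torch
import torch.nn as nn

# Same configuration as the quiz snippet: 2 stacked layers, hidden size 20.
lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)
inputs = torch.randn(5, 3, 10)  # (seq_len, batch, input_size)
output, (h_n, c_n) = lstm(inputs)

# output collects the top layer's hidden state at every time step:
# (seq_len, batch, hidden_size)
print(output.shape)

# h_n and c_n hold the final state of *every* layer:
# (num_layers, batch, hidden_size)
print(h_n.shape, c_n.shape)
```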

🧠 Conceptual
intermediate
Why do we detach hidden states in RNN training?

In training recurrent neural networks, why is it important to detach the hidden state from the computation graph between batches?

A) To convert the hidden state to a NumPy array
B) To increase the size of the hidden state tensor
C) To reset the model weights to their initial values
D) To prevent backpropagation through the entire history and save memory
💡 Hint

Think about how backpropagation works through time in RNNs.
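The detach pattern shows up in truncated backpropagation through time. A minimal sketch (hypothetical toy sizes) that carries the hidden state across batches while cutting the graph at each batch boundary:

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=4, hidden_size=8)
opt = torch.optim.SGD(rnn.parameters(), lr=0.01)

h = torch.zeros(1, 2, 8)  # (num_layers, batch, hidden_size)
for step in range(3):
    x = torch.randn(5, 2, 4)      # one batch of sequences
    out, h = rnn(x, h)
    loss = out.pow(2).mean()      # placeholder loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Without detach(), the next backward() would try to traverse the
    # graphs of all previous batches, raising an error (the freed graph)
    # or, if retained, growing memory without bound.
    h = h.detach()
```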

🔧 Debug
advanced
Identify the error in hidden state initialization

What is wrong with the following code snippet for initializing hidden states in a GRU model?

PyTorch
import torch
import torch.nn as nn

gru = nn.GRU(input_size=8, hidden_size=16, num_layers=1)
batch_size = 4
h0 = torch.zeros(batch_size, 16)  # Intended initial hidden state
inputs = torch.randn(10, batch_size, 8)
output, hn = gru(inputs, h0)
A) h0 should have shape (num_layers, batch_size, hidden_size), not (batch_size, hidden_size)
B) h0 should be initialized with ones, not zeros
C) inputs should have batch_size as its first dimension, not its second
D) GRU requires the hidden state to be on the CPU, not the default device
💡 Hint

Check the expected shape of the initial hidden state for GRU layers.
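For reference, a minimal sketch of a correctly shaped initial state: `nn.GRU` expects a 3-D tensor laid out as (num_layers, batch_size, hidden_size), and returns the final state with the same layout.

```python
import torch
import torch.nn as nn

gru = nn.GRU(input_size=8, hidden_size=16, num_layers=1)
batch_size = 4

# Correct: a 3-D initial state, (num_layers, batch_size, hidden_size).
h0 = torch.zeros(1, batch_size, 16)
inputs = torch.randn(10, batch_size, 8)  # (seq_len, batch, input_size)
output, hn = gru(inputs, h0)

print(hn.shape)  # same (num_layers, batch, hidden_size) layout as h0
```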

Hyperparameter
advanced
Effect of hidden state size on model capacity

Increasing the hidden state size in an RNN model primarily affects which of the following?

A) The number of input features processed per time step
B) The model's ability to capture more complex patterns by increasing its capacity
C) The length of the input sequences the model can handle
D) The batch size used during training
💡 Hint

Think about what hidden state size controls in an RNN.
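The capacity effect is easy to quantify: an RNN's hidden-to-hidden weight matrix is hidden_size × hidden_size, so the parameter count grows roughly quadratically with hidden size. A quick sketch (hypothetical helper, vanilla `nn.RNN`):

```python
import torch.nn as nn

def n_params(hidden_size, input_size=10):
    """Total trainable parameters of a single-layer vanilla RNN."""
    rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size)
    return sum(p.numel() for p in rnn.parameters())

# Doubling the hidden size roughly quadruples the recurrent weights,
# since the h*h matrix dominates as h grows.
for h in (16, 32, 64):
    print(h, n_params(h))
```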

Metrics
expert
Interpreting hidden state outputs for sequence classification

In a sequence classification task using an LSTM, which hidden state output is typically used as the representation for the entire sequence?

A) The last hidden state of the top LSTM layer (h_n at the last time step)
B) The average of all hidden states across all time steps
C) The first hidden state of the bottom LSTM layer
D) The cell state (c_n) of the first LSTM layer
💡 Hint

Consider which hidden state summarizes the sequence information after processing.
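In practice, this pattern usually looks like the following sketch (hypothetical model and sizes): since h_n is stacked as (num_layers, batch, hidden_size), `h_n[-1]` picks out the top layer's final hidden state, which then feeds a linear classification head.

```python
import torch
import torch.nn as nn

class SeqClassifier(nn.Module):
    """Classify a sequence from the final hidden state of the top LSTM layer."""

    def __init__(self, input_size=10, hidden_size=20, num_classes=3):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers=2)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, x):          # x: (seq_len, batch, input_size)
        _, (h_n, _) = self.lstm(x)
        top_last = h_n[-1]         # (batch, hidden_size): top layer, last step
        return self.head(top_last)

logits = SeqClassifier()(torch.randn(7, 4, 10))
print(logits.shape)  # (batch, num_classes)
```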