
nn.GRU layer in PyTorch - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output shape of nn.GRU layer
Given the following PyTorch code, what is the shape of the output tensor out after running the GRU layer?
PyTorch
import torch
import torch.nn as nn

gru = nn.GRU(input_size=10, hidden_size=20, num_layers=2)
input_tensor = torch.randn(5, 3, 10)  # (seq_len, batch, input_size)
out, h_n = gru(input_tensor)
print(out.shape)
A) torch.Size([3, 20, 5])
B) torch.Size([3, 5, 20])
C) torch.Size([5, 20, 3])
D) torch.Size([5, 3, 20])
💡 Hint
Remember: with batch_first=False (the default), the GRU output shape is (seq_len, batch, hidden_size).
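After attempting the question, you can verify your answer by running the code and inspecting both returned tensors. The shapes annotated below assume the default batch_first=False:

```python
import torch
import torch.nn as nn

# With batch_first=False (the default), nn.GRU expects input of shape
# (seq_len, batch, input_size) and returns:
#   out : (seq_len, batch, hidden_size)    -- per-step outputs of the top layer
#   h_n : (num_layers, batch, hidden_size) -- final hidden state of each layer
gru = nn.GRU(input_size=10, hidden_size=20, num_layers=2)
input_tensor = torch.randn(5, 3, 10)  # (seq_len=5, batch=3, input_size=10)
out, h_n = gru(input_tensor)

print(out.shape)  # torch.Size([5, 3, 20])
print(h_n.shape)  # torch.Size([2, 3, 20])
```

Note that num_layers affects h_n but leaves the shape of out unchanged.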
Model Choice
intermediate
Choosing GRU for sequence data
You want to build a model to predict the next word in a sentence using sequential text data. Which of the following is the best reason to choose an nn.GRU layer over a simple RNN layer?
A) GRU's gating mechanism (update and reset gates) lets it capture long-term dependencies that a simple RNN struggles with because of vanishing gradients.
B) GRU layers are faster to train because they use convolution operations.
C) Simple RNNs have built-in attention mechanisms, but GRUs do not.
D) GRU layers require the input to be one-hot encoded, unlike simple RNNs.
💡 Hint
Think about how GRUs handle memory compared to simple RNNs.
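One caveat worth checking yourself: the gating comes at a parameter cost relative to a simple RNN (the usual "fewer parameters" comparison is GRU vs. LSTM, not GRU vs. RNN). A quick sketch to count them:

```python
import torch.nn as nn

# GRU's three gates (reset, update, new) triple the weights of a plain RNN
# with the same input and hidden sizes; the "fewer parameters" selling point
# is relative to an LSTM, which has four gates.
rnn = nn.RNN(input_size=10, hidden_size=20)
gru = nn.GRU(input_size=10, hidden_size=20)

def n_params(module):
    return sum(p.numel() for p in module.parameters())

print(n_params(rnn))  # 640
print(n_params(gru))  # 1920 (exactly 3x the RNN)
```

So the reason to pick a GRU over a simple RNN is the gated memory, not a smaller model.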
Hyperparameter
advanced
Effect of num_layers in nn.GRU
What is the effect of increasing the num_layers parameter in an nn.GRU layer from 1 to 3?
A) The GRU will have 3 stacked layers, allowing it to learn more complex features from the sequence.
B) The GRU will process the input sequence 3 times in parallel and average the results.
C) The GRU will increase the hidden size by 3 times automatically.
D) The GRU will reduce the sequence length by a factor of 3.
💡 Hint
Think about what stacking layers means in neural networks.
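You can confirm what stacking does by comparing shapes: each layer consumes the output sequence of the layer below it, so num_layers shows up in h_n but not in out:

```python
import torch
import torch.nn as nn

# Stacked GRU: layer k consumes the output sequence of layer k-1.
# Increasing num_layers changes h_n's first dimension, not out's shape.
x = torch.randn(5, 3, 10)  # (seq_len, batch, input_size)
for num_layers in (1, 3):
    gru = nn.GRU(input_size=10, hidden_size=20, num_layers=num_layers)
    out, h_n = gru(x)
    print(num_layers, out.shape, h_n.shape)
# out stays (5, 3, 20); h_n is (num_layers, 3, 20)
```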
Metrics
advanced
Interpreting GRU training loss
During training of a GRU-based model for time series prediction, the training loss decreases steadily but the validation loss starts increasing after some epochs. What does this indicate?
A) The GRU layer is not suitable for time series data.
B) The model is underfitting and needs more training epochs.
C) The model is overfitting the training data and not generalizing well.
D) The learning rate is too low, causing slow convergence.
💡 Hint
Think about what it means when validation loss rises but training loss falls.
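A common remedy for this situation is early stopping: halt training once validation loss stops improving. Here is a minimal, framework-free sketch of the stopping rule (the loss values are made up purely for illustration):

```python
# Stop when validation loss has not improved for `patience` epochs.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]  # made-up values
patience = 2

best, wait, stop_epoch = float("inf"), 0, None
for epoch, loss in enumerate(val_losses):
    if loss < best:
        best, wait = loss, 0  # new best: reset the patience counter
    else:
        wait += 1
        if wait >= patience:
            stop_epoch = epoch  # validation loss has risen for 2 epochs
            break

print(stop_epoch, best)  # 5 0.5
```

In a real training loop you would also keep a checkpoint of the weights from the best-validation epoch and restore them when stopping fires.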
🔧 Debug
expert
Identifying error in GRU input shape
You run this code and get a runtime error. What is the cause?
import torch
import torch.nn as nn

gru = nn.GRU(input_size=8, hidden_size=16)
input_tensor = torch.randn(4, 10)  # Missing batch dimension
out, h_n = gru(input_tensor)
A) Hidden size must equal input size, but here they differ.
B) The input tensor does not match the expected (seq_len, batch, input_size) shape: it is 2D, and its last dimension (10) does not equal input_size=8.
C) GRU requires the input tensor to be on the GPU, but input_tensor is on the CPU.
D) The batch size must be the first dimension, but here it is the second.
💡 Hint
Check the expected input shape for nn.GRU layers.
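A corrected version of the code, assuming the default batch_first=False. One caveat: PyTorch 1.11+ also accepts unbatched 2D input of shape (seq_len, input_size), so on recent versions the error that actually fires for the original code is the feature-size mismatch (10 vs. input_size=8):

```python
import torch
import torch.nn as nn

gru = nn.GRU(input_size=8, hidden_size=16)

# Fixed: 3D input with an explicit batch dimension, and a last dimension
# that matches input_size.
input_tensor = torch.randn(4, 1, 8)  # (seq_len=4, batch=1, input_size=8)
out, h_n = gru(input_tensor)

print(out.shape)  # torch.Size([4, 1, 16])
print(h_n.shape)  # torch.Size([1, 1, 16])
```

If you had a 2D tensor of shape (seq_len, input_size), `input_tensor.unsqueeze(1)` would add the missing batch dimension.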