Recall & Review
beginner
What does the nn.GRU layer in PyTorch do?
The nn.GRU layer processes sequences by using Gated Recurrent Units to keep track of information over time, helping models understand order and context in data like sentences or time series.
intermediate
What are the main components inside a GRU cell?
A GRU cell has two gates: the update gate, which decides how much of the previous hidden state to keep versus replace with new content, and the reset gate, which decides how much of the previous hidden state is used when computing the new candidate state.
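To make the two gates concrete, here is a minimal sketch of a single GRU step for a scalar input and a scalar hidden state. It follows the gate equations given in the PyTorch nn.GRU documentation; the function name gru_step, the params dictionary, and all weight values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h, p):
    """One GRU step for scalar input x and scalar hidden state h.

    p is a dict of illustrative (hypothetical) scalar weights and biases.
    """
    # Reset gate: how much of the old state feeds the candidate.
    r = sigmoid(p["w_ir"] * x + p["w_hr"] * h + p["b_r"])
    # Update gate: how much of the old state is kept as-is.
    z = sigmoid(p["w_iz"] * x + p["w_hz"] * h + p["b_z"])
    # Candidate state: new content, with the old state scaled by r.
    n = math.tanh(p["w_in"] * x + p["b_in"] + r * (p["w_hn"] * h + p["b_hn"]))
    # Blend: z near 1 keeps the old state, z near 0 takes the candidate.
    return (1.0 - z) * n + z * h


params = {
    "w_ir": 0.5, "w_hr": 0.5, "b_r": 0.0,
    "w_iz": 0.5, "w_hz": 0.5, "b_z": 0.0,
    "w_in": 0.5, "w_hn": 0.5, "b_in": 0.0, "b_hn": 0.0,
}

h = 0.0
for x in [1.0, -0.5, 0.25]:
    h = gru_step(x, h, params)

# A very large update-gate bias pushes z to 1, so the old state is kept.
kept = gru_step(1.0, 0.7, {**params, "b_z": 50.0})
```

With the update gate saturated near 1, the step returns the previous hidden state almost unchanged, which is what "keeping past information" means in practice.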
beginner
How do you create a simple nn.GRU layer in PyTorch for input size 10 and hidden size 20?
Use: nn.GRU(input_size=10, hidden_size=20). This sets the input feature size to 10 and the hidden state size to 20.
intermediate
What is the shape of the output from nn.GRU when batch_first=True and input shape is (batch, seq_len, input_size)?
The output shape is (batch, seq_len, num_directions * hidden_size), where num_directions is 2 for a bidirectional GRU and 1 otherwise. It holds the hidden states of the last layer for each time step in the sequence.
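The shape rule can be checked with a short sketch that runs the same batch through a unidirectional and a bidirectional GRU. The sizes (batch 4, sequence length 7, input size 10, hidden size 20) are arbitrary values picked for the example.

```python
import torch
import torch.nn as nn

batch, seq_len, input_size, hidden_size = 4, 7, 10, 20
x = torch.randn(batch, seq_len, input_size)

uni = nn.GRU(input_size, hidden_size, batch_first=True)
bi = nn.GRU(input_size, hidden_size, batch_first=True, bidirectional=True)

out_uni, h_uni = uni(x)
out_bi, h_bi = bi(x)

print(out_uni.shape)  # torch.Size([4, 7, 20])
print(out_bi.shape)   # torch.Size([4, 7, 40]): forward and backward states concatenated
print(h_bi.shape)     # torch.Size([2, 4, 20]): one final state per direction
```

Note that batch_first=True only changes the layout of the input and the output; the final hidden state keeps the layout (num_layers * num_directions, batch, hidden_size).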
intermediate
Why might you choose GRU over LSTM in a model?
GRUs are simpler and faster to train because they have fewer gates than LSTMs, but still handle sequence data well, making them good for smaller datasets or faster experiments.
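One rough way to see the size difference is to count parameters. The sketch below assumes the single-layer, unidirectional layout PyTorch uses, where each gate-like block has an input weight matrix, a hidden weight matrix, and two bias vectors; a GRU has three such blocks and an LSTM has four. The helper name recurrent_param_count is hypothetical.

```python
def recurrent_param_count(num_blocks, input_size, hidden_size):
    """Parameters in one unidirectional recurrent layer (PyTorch-style layout)."""
    per_block = (
        hidden_size * input_size     # input-to-hidden weights
        + hidden_size * hidden_size  # hidden-to-hidden weights
        + 2 * hidden_size            # two bias vectors
    )
    return num_blocks * per_block


gru_params = recurrent_param_count(3, input_size=10, hidden_size=20)
lstm_params = recurrent_param_count(4, input_size=10, hidden_size=20)

print(gru_params)   # 1920
print(lstm_params)  # 2560
```

For the same sizes the GRU has three quarters of the LSTM's parameters, which is one reason it tends to be cheaper per step; whether it trains faster overall still depends on the task.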
What does the update gate in a GRU control?
A. How much past information to keep
B. How to reset the hidden state
C. The input feature size
D. The output sequence length
The update gate controls how much of the past information is kept in the current hidden state.
In PyTorch, what argument makes nn.GRU expect input shape as (batch, seq_len, input_size)?
A. dropout=0.5
B. bidirectional=True
C. num_layers=2
D. batch_first=True
Setting batch_first=True makes the input and output tensors have batch size as the first dimension.
Which of these is NOT a gate in a GRU cell?
A. Forget gate
B. Update gate
C. Reset gate
D. None of the above
Forget gate is part of LSTM, not GRU.
What is the main advantage of GRU compared to LSTM?
A. Requires more memory
B. Handles longer sequences better
C. Simpler and faster to train
D. Has more gates
GRUs have fewer gates, making them simpler and faster to train.
What does the hidden_size parameter in nn.GRU specify?
A. The length of the input sequence
B. The size of the hidden state vector
C. The number of layers
D. The batch size
hidden_size sets the dimension of the hidden state vector in the GRU.
Explain how a GRU layer processes sequence data and why it is useful.
Think about how GRU keeps important information from the past while reading new data.
Describe how to set up and use an nn.GRU layer in PyTorch including input and output shapes.
Consider the shape of input and output tensors and the key parameters.
Practice
1. What is the primary purpose of the nn.GRU layer in PyTorch?
easy
A. To reduce the dimensionality of data using PCA
B. To perform image classification using convolution
C. To process sequential data by remembering past information
D. To generate random numbers for initialization
Solution
Step 1: Understand the role of GRU
The GRU (Gated Recurrent Unit) is designed to handle sequences by keeping track of past inputs, which helps in tasks like text or speech processing.
Step 2: Compare with other options
The other options describe unrelated tasks: dimensionality reduction using PCA, image classification using convolution, and random number generation, which are not the purpose of GRU.
Final Answer:
To process sequential data by remembering past information -> Option C
Quick Check:
GRU = sequence memory [OK]
Hint: GRU remembers past inputs in sequences [OK]
Common Mistakes:
Confusing GRU with convolution layers
Thinking GRU reduces data dimensions like PCA
Assuming GRU generates random values
2. Which of the following is the correct way to create a GRU layer with input size 10 and hidden size 20 in PyTorch?
easy
A. nn.GRU(20, 10)
B. nn.GRU(input_size=10, hidden_size=20)
C. nn.GRU(hidden_size=10, input_size=20)
D. nn.GRU(10)
Solution
Step 1: Recall GRU constructor parameters
The correct order and names are input_size first, then hidden_size. So nn.GRU(input_size=10, hidden_size=20) is correct.
Step 2: Check other options
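nn.GRU(20, 10) passes the sizes positionally in reverse, giving input_size=20 and hidden_size=10. nn.GRU(hidden_size=10, input_size=20) uses valid parameter names but assigns the values the wrong way round. nn.GRU(10) omits the required hidden_size argument, so the constructor fails.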
Final Answer:
nn.GRU(input_size=10, hidden_size=20) -> Option B
Quick Check:
Input size first, hidden size second [OK]
Hint: Remember: input_size before hidden_size in nn.GRU [OK]
Common Mistakes:
Swapping input_size and hidden_size
Omitting hidden_size parameter
Using wrong parameter names
3. Given the following code, what is the shape of the output tensor out?
import torch
import torch.nn as nn
gru = nn.GRU(input_size=5, hidden_size=3, batch_first=True)
x = torch.randn(4, 7, 5) # batch=4, seq_len=7, input_size=5
out, h_n = gru(x)
print(out.shape)
medium
A. (4, 7, 3)
B. (7, 4, 3)
C. (4, 3, 7)
D. (7, 3, 4)
Solution
Step 1: Understand batch_first=True effect
With batch_first=True, input shape is (batch, seq_len, input_size). Output shape matches (batch, seq_len, hidden_size).
Step 2: Apply shapes from code
Input is (4, 7, 5), hidden_size=3, so output out shape is (4, 7, 3).
Final Answer:
(4, 7, 3) -> Option A
Quick Check:
Output shape = (batch, seq_len, hidden_size) [OK]
Hint: batch_first=True means batch is first dimension [OK]
Common Mistakes:
Confusing batch and sequence dimensions
Ignoring batch_first parameter
Mixing hidden_size with input_size
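The second return value is easy to overlook. As a small sketch using the same sizes as the question, the following checks the shape of h_n and shows that, for a single-layer unidirectional GRU, it equals the last time step of out.

```python
import torch
import torch.nn as nn

gru = nn.GRU(input_size=5, hidden_size=3, batch_first=True)
x = torch.randn(4, 7, 5)  # batch=4, seq_len=7, input_size=5

out, h_n = gru(x)

# out holds every time step; h_n holds only the final hidden state.
last_step = out[:, -1, :]

print(out.shape)  # torch.Size([4, 7, 3])
print(h_n.shape)  # torch.Size([1, 4, 3])
print(torch.allclose(last_step, h_n[0]))  # True
```

This equality holds only for the unidirectional, single-layer case with unpadded input; with bidirectional=True or packed sequences the relationship between out and h_n is different.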
4. Which of the following correctly describes the execution of this code snippet?
import torch
import torch.nn as nn
gru = nn.GRU(input_size=8, hidden_size=4)
x = torch.randn(5, 10, 8)
out, h = gru(x)
print(out.shape)
medium
A. The code runs without errors and prints (5, 10, 4)
B. The hidden_size must be larger than input_size
C. The GRU layer requires batch_first=True for this input shape
D. The input tensor shape is incorrect for default GRU settings
Solution
Step 1: Check default GRU input expectations
By default, GRU expects input shape (seq_len, batch, input_size). Here, input is (5, 10, 8), so seq_len=5, batch=10, input_size=8 which matches default.
Step 2: Verify output shape
Output shape will be (seq_len, batch, hidden_size) = (5, 10, 4).
Step 3: Evaluate statements
The code runs without errors and prints (5, 10, 4). Hidden_size can be smaller than input_size. batch_first=True is not required. Input shape is correct for default settings.
Final Answer:
The code runs without errors and prints (5, 10, 4) -> Option A
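Because the same tensor is valid under both layouts, a layout mix-up does not raise an error. The sketch below feeds the tensor from the question to a default GRU and to one with batch_first=True; the shape of the final hidden state reveals which axis was treated as the batch.

```python
import torch
import torch.nn as nn

x = torch.randn(5, 10, 8)

# Default layout: (seq_len, batch, input_size) -> seq_len=5, batch=10
seq_first = nn.GRU(input_size=8, hidden_size=4)
# batch_first layout: (batch, seq_len, input_size) -> batch=5, seq_len=10
batch_first = nn.GRU(input_size=8, hidden_size=4, batch_first=True)

out_a, h_a = seq_first(x)
out_b, h_b = batch_first(x)

print(out_a.shape, h_a.shape)  # torch.Size([5, 10, 4]) torch.Size([1, 10, 4])
print(out_b.shape, h_b.shape)  # torch.Size([5, 10, 4]) torch.Size([1, 5, 4])
```

Both outputs have the same shape, but the recurrence runs along a different axis in each case, so checking h.shape is a quick way to confirm the layer reads the data the way you intended.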
A batch tensor must be rectangular, so sequences of different lengths have to be padded to a common length before they can be batched. Wrapping the padded batch with pack_padded_sequence then lets nn.GRU skip the padded time steps.
Step 2: Evaluate options
Padding the sequences to the same length and applying pack_padded_sequence before nn.GRU handles variable lengths correctly. Feeding raw variable-length sequences directly to nn.GRU without padding is invalid, because nn.GRU cannot take a batch of raw variable-length sequences. Using nn.GRU with batch_first=False while ignoring sequence lengths lets the padded steps influence the hidden states, which gives wrong results. Manually truncating all sequences to the shortest length before input discards data.
Final Answer:
Pad sequences and use pack_padded_sequence before nn.GRU -> Option D
Quick Check:
Use padding + packing for variable-length sequences [OK]
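The padding and packing workflow can be sketched as follows, using the utilities in torch.nn.utils.rnn. The three sequence lengths (5, 3, 2) and the feature sizes are arbitrary values chosen for the example.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import (
    pad_sequence,
    pack_padded_sequence,
    pad_packed_sequence,
)

torch.manual_seed(0)

# Three sequences of different lengths, each with 8 features per step.
seqs = [torch.randn(5, 8), torch.randn(3, 8), torch.randn(2, 8)]
lengths = torch.tensor([len(s) for s in seqs])

# Pad to a rectangular batch: (batch=3, max_len=5, input_size=8)
padded = pad_sequence(seqs, batch_first=True)

# Pack so the GRU skips the padded steps.
packed = pack_padded_sequence(
    padded, lengths, batch_first=True, enforce_sorted=False
)

gru = nn.GRU(input_size=8, hidden_size=4, batch_first=True)
packed_out, h_n = gru(packed)

# Unpack back to a padded tensor; padded positions are filled with zeros.
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)

print(out.shape)  # torch.Size([3, 5, 4])
print(h_n.shape)  # torch.Size([1, 3, 4])
```

With packing, h_n holds each sequence's state at its own last valid step rather than at the padded end, which is usually what a classifier on top of the GRU needs.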