Challenge - 5 Problems
GRU Mastery Badge
Get all challenges correct to earn this badge!
Test your skills under time pressure!
❓ Predict Output
intermediate · 2:00 remaining
Output of GRU layer with simple input
What is the shape of the output tensor after passing a batch of 2 sequences, each of length 3 with 4 features, through a GRU layer with 5 hidden units? (In PyTorch, nn.GRU's output tensor always contains every timestep, the equivalent of Keras's return_sequences=True.)
NLP
import torch
import torch.nn as nn

gru = nn.GRU(input_size=4, hidden_size=5, batch_first=True, bidirectional=False)
input_tensor = torch.randn(2, 3, 4)  # batch=2, seq_len=3, features=4
output, hidden = gru(input_tensor)
print(output.shape)
Attempts: 2 left
💡 Hint
Remember that batch_first=True means batch size is the first dimension in input and output.
✗ Incorrect
The GRU layer's output tensor has shape (batch_size, seq_len, hidden_size), since PyTorch's nn.GRU returns the hidden state at every timestep (the equivalent of Keras's return_sequences=True). Here batch_size=2, seq_len=3, hidden_size=5, so the result is torch.Size([2, 3, 5]).
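The shapes above can be checked directly; a minimal sketch with random weights:

```python
import torch
import torch.nn as nn

gru = nn.GRU(input_size=4, hidden_size=5, batch_first=True)
x = torch.randn(2, 3, 4)   # (batch=2, seq_len=3, features=4)
output, hidden = gru(x)

print(output.shape)        # torch.Size([2, 3, 5]) -- one vector per timestep
print(hidden.shape)        # torch.Size([1, 2, 5]) -- final state only
```

Note that `hidden` carries only the last timestep's state, shaped (num_layers * num_directions, batch, hidden_size).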
🧠 Conceptual
intermediate · 1:30 remaining
Purpose of the reset gate in GRU
What is the main role of the reset gate in a GRU cell when processing text sequences?
Attempts: 2 left
💡 Hint
Think about which gate decides how much of the previous hidden state may influence the candidate activation.
✗ Incorrect
The reset gate controls how much of the previous hidden state to forget when computing the candidate activation, allowing the model to reset memory for new inputs.
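That mechanism can be sketched directly. This follows Cho et al.'s GRU formulation with biases omitted; all weight tensors here are illustrative random values, not a trained model:

```python
import torch

torch.manual_seed(0)
d_in, d_h = 4, 3
Wr, Ur = torch.randn(d_in, d_h), torch.randn(d_h, d_h)   # reset-gate weights
Wh, Uh = torch.randn(d_in, d_h), torch.randn(d_h, d_h)   # candidate weights

def candidate(x, h_prev):
    # reset gate: values near 0 drop the old state, near 1 keep it
    r = torch.sigmoid(x @ Wr + h_prev @ Ur)
    # the candidate activation only sees the *reset* state r * h_prev,
    # so the cell can start fresh when a new phrase or topic begins
    return torch.tanh(x @ Wh + (r * h_prev) @ Uh)

x, h_prev = torch.randn(1, d_in), torch.randn(1, d_h)
print(candidate(x, h_prev).shape)   # torch.Size([1, 3])
```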
❓ Hyperparameter
advanced · 1:30 remaining
Choosing GRU hidden size for text classification
You want to build a GRU-based text classifier. Which hidden size choice is most likely to balance model capacity and training speed for a medium-sized dataset?
Attempts: 2 left
💡 Hint
A hidden size that is too small limits what the model can learn; one that is too large slows training and risks overfitting.
✗ Incorrect
A hidden size of 128 is a common good balance for medium datasets, providing enough capacity without excessive computation or overfitting risk.
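In practice the hidden size is just one constructor argument of the classifier. A sketch of a typical architecture, where vocab_size, embed_dim, and num_classes are illustrative values, not prescriptions:

```python
import torch
import torch.nn as nn

class GRUClassifier(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=100,
                 hidden_size=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, token_ids):
        emb = self.embed(token_ids)   # (batch, seq_len, embed_dim)
        _, h = self.gru(emb)          # h: (1, batch, hidden_size)
        return self.fc(h[-1])         # logits: (batch, num_classes)

model = GRUClassifier()
logits = model(torch.randint(0, 10_000, (8, 25)))   # batch of 8, length 25
print(logits.shape)                                  # torch.Size([8, 2])
```

Doubling hidden_size roughly quadruples the GRU's recurrent weight matrices, which is why 256 or 512 can noticeably slow training on a medium dataset.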
❓ Metrics
advanced · 2:00 remaining
Evaluating GRU model performance on text data
After training a GRU model for sentiment analysis, you get these results on the test set: accuracy=0.85, precision=0.60, recall=0.95. What does this tell you about the model's predictions?
Attempts: 2 left
💡 Hint
High recall but low precision means many false alarms.
✗ Incorrect
High recall (0.95) means the model finds most positive cases, but low precision (0.60) means many predicted positives are wrong (false positives).
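The three reported numbers are mutually consistent. With hypothetical confusion-matrix counts chosen to reproduce them exactly (not the quiz's actual data):

```python
# 180 actual positives, 640 actual negatives
tp, fn = 171, 9      # recall = 171 / 180 = 0.95
fp, tn = 114, 526    # precision = 171 / 285 = 0.60

precision = tp / (tp + fp)
recall = tp / (tp + fn)
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
# 114 false positives against 171 true positives: many "positive" calls are wrong,
# even though only 9 real positives were missed
```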
🔧 Debug
expert · 2:30 remaining
Identifying error in GRU input shape
You run this PyTorch code and get a runtime error:
import torch
import torch.nn as nn
gru = nn.GRU(input_size=10, hidden_size=20, batch_first=True)
input_tensor = torch.randn(5, 10, 20)
output, hidden = gru(input_tensor)
What is the cause of the error?
Attempts: 2 left
💡 Hint
Check the meaning of each dimension when batch_first=True.
✗ Incorrect
With batch_first=True, nn.GRU expects input of shape (batch_size, seq_len, input_size). The tensor here is (5, 10, 20): its last dimension is 20, but the layer was built with input_size=10, so the sequence-length and feature dimensions are swapped. PyTorch raises a RuntimeError because the final dimension must equal input_size; arranging the data as (5, 20, 10) fixes it.
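For reference, a sketch with the layout nn.GRU actually expects when batch_first=True:

```python
import torch
import torch.nn as nn

gru = nn.GRU(input_size=10, hidden_size=20, batch_first=True)
# (batch=5, seq_len=20, input_size=10): the last dim must equal input_size
input_tensor = torch.randn(5, 20, 10)
output, hidden = gru(input_tensor)
print(output.shape)   # torch.Size([5, 20, 20])
```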