Challenge - 5 Problems
RNN Text Generation Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
🧠 Conceptual
intermediate · 2:00 remaining
Understanding RNN hidden state behavior
In an RNN used for text generation, what does the hidden state represent during training?
Attempts: 2 left
💡 Hint
Think about how RNNs remember context from earlier inputs.
✗ Incorrect
The hidden state in an RNN acts like a memory that carries information from previous inputs, helping the model understand context when predicting the next character or word.
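A minimal sketch (not part of the challenge code) of how the hidden state accumulates context: stepping an `nn.RNNCell` over a toy sequence, the same tensor `h` is fed back in at every step, so after the loop it summarizes everything seen so far.

```python
import torch
import torch.nn as nn

# Toy setup: one-hot vectors for a 5-symbol vocabulary, hidden size 3.
cell = nn.RNNCell(input_size=5, hidden_size=3)
h = torch.zeros(1, 3)  # initial hidden state: no context yet

sequence = torch.eye(5)  # five one-hot "characters"
for x in sequence:
    h = cell(x.unsqueeze(0), h)  # h now summarizes all characters seen so far

print(h.shape)  # torch.Size([1, 3])
```

During text generation the same mechanism runs one character at a time: the model predicts from `h`, samples a character, and feeds it back in.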
❓ Predict Output
intermediate · 2:00 remaining
Output shape of RNN layer in text generation
What is the shape of the output tensor from an RNN layer when processing a batch of sequences with shape (batch_size=4, sequence_length=10, input_dim=8) and hidden size 16?
NLP
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
inputs = torch.randn(4, 10, 8)
output, hidden = rnn(inputs)
print(output.shape)
Attempts: 2 left
💡 Hint
Remember batch_first=True means batch is the first dimension.
✗ Incorrect
The RNN output has shape (batch_size, sequence_length, hidden_size) when batch_first=True. Here, batch_size=4, sequence_length=10, hidden_size=16, so the snippet prints torch.Size([4, 10, 16]).
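A quick sketch extending the snippet above to also inspect the second return value, which is easy to confuse with the output: `hidden` is the final hidden state with shape (num_layers, batch, hidden_size), and it keeps that layout even with `batch_first=True`.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
inputs = torch.randn(4, 10, 8)  # (batch, seq_len, input_dim)
output, hidden = rnn(inputs)

print(output.shape)  # torch.Size([4, 10, 16]): hidden state at every time step
print(hidden.shape)  # torch.Size([1, 4, 16]): final state per layer (num_layers, batch, hidden)
```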
❓ Hyperparameter
advanced · 2:00 remaining
Choosing sequence length for training RNN text generation
Which sequence length is generally better for training an RNN text generator to capture long-term dependencies without causing too much memory use?
Attempts: 2 left
💡 Hint
Think about the trade-off between context and computational resources.
✗ Incorrect
Moderate sequence lengths let the RNN learn useful context while avoiding the excessive memory use and vanishing gradients that very long sequences cause.
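In practice, a long text is usually split into fixed-length chunks before training. A hypothetical helper (names are illustrative, not from the challenge) that pairs each chunk with its next-character targets:

```python
# Split a character stream into (input, target) training chunks of a
# moderate length; target is the input shifted one character ahead.
def make_chunks(text, seq_len):
    chunks = []
    for i in range(0, len(text) - seq_len, seq_len):
        inp = text[i : i + seq_len]
        tgt = text[i + 1 : i + seq_len + 1]  # next-character targets
        chunks.append((inp, tgt))
    return chunks

pairs = make_chunks("hello world, hello rnn!", seq_len=8)
print(len(pairs), pairs[0])  # 2 ('hello wo', 'ello wor')
```

Choosing `seq_len` here is exactly the trade-off in the question: larger chunks expose longer dependencies but cost more memory per batch.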
❓ Metrics
advanced · 2:00 remaining
Evaluating RNN text generation with perplexity
What does a lower perplexity score indicate when evaluating an RNN text generation model?
Attempts: 2 left
💡 Hint
Perplexity measures how well the model predicts the next token.
✗ Incorrect
Lower perplexity means the model assigns higher probabilities to the correct next tokens, showing better prediction performance.
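Concretely, perplexity is the exponential of the average negative log-likelihood of the correct tokens. A small sketch with made-up probabilities shows why higher probabilities on the right tokens mean lower perplexity:

```python
import math

def perplexity(token_probs):
    # exp(average negative log-likelihood of the correct tokens)
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

probs_good = [0.9, 0.8, 0.7]  # model is confident about the correct tokens
probs_bad = [0.2, 0.1, 0.3]   # model is unsure

print(perplexity(probs_good))  # ~1.26, close to the ideal of 1
print(perplexity(probs_bad))   # ~5.50, noticeably worse
```

A perplexity of 1 would mean the model predicts every token with certainty; a uniform guess over a vocabulary of size V gives perplexity V.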
🔧 Debug
expert · 3:00 remaining
Identifying cause of exploding gradients in RNN training
While training an RNN for text generation, the loss suddenly becomes NaN and the model weights explode. Which code snippet is the most likely cause?
NLP
optimizer = torch.optim.Adam(model.parameters(), lr=1.0)
for inputs, targets in dataloader:
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = loss_fn(outputs, targets)
    loss.backward()
    optimizer.step()
Attempts: 2 left
💡 Hint
Check the learning rate value and its effect on training stability.
✗ Incorrect
A very high learning rate like 1.0 can cause large weight updates, leading to exploding gradients and NaN loss values.
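A sketch of the usual fix (model, loss, and data here are illustrative stand-ins): drop the learning rate to a sane value and clip the gradient norm before each optimizer step, which is the standard guard against exploding gradients in RNN training.

```python
import torch
import torch.nn as nn

model = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # instead of lr=1.0

inputs = torch.randn(4, 10, 8)    # placeholder batch
targets = torch.randn(4, 10, 16)  # placeholder targets

optimizer.zero_grad()
outputs, _ = model(inputs)
loss = loss_fn(outputs, targets)
loss.backward()
# Cap the total gradient norm so a single bad batch can't blow up the weights.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()

print(torch.isfinite(loss).item())  # True: loss stays finite
```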