Recall & Review
beginner
What does GRU stand for in machine learning?
GRU stands for Gated Recurrent Unit, a type of neural network layer used to process sequences like text.
beginner
How does a GRU help in understanding text sequences?
A GRU keeps track of important information from earlier words and decides what to remember or forget, helping the model understand context in text.
intermediate
Name the two main gates in a GRU cell and their roles.
The two main gates are the update gate, which decides how much past information to keep, and the reset gate, which decides how much of the past memory is used when forming the new candidate state from the current input.
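To make the two gates concrete, here is a minimal sketch of a single GRU step for a scalar input and a scalar hidden state. The function name `gru_step` and the parameter names in the dictionary are hypothetical, and the toy sizes are an assumption for readability. The blending rule follows the convention PyTorch documents for `nn.GRU`, where an update gate near 1 keeps the previous state.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h_prev, p):
    """One GRU step with scalar input and scalar hidden state (toy sketch)."""
    # Update gate: near 1 keeps the previous state, near 0 replaces it.
    z = sigmoid(p["w_z"] * x + p["u_z"] * h_prev + p["b_z"])
    # Reset gate: scales how much of the previous state feeds the candidate.
    r = sigmoid(p["w_r"] * x + p["u_r"] * h_prev + p["b_r"])
    # Candidate state built from the current input and the reset-scaled memory.
    n = math.tanh(p["w_n"] * x + r * (p["u_n"] * h_prev) + p["b_n"])
    # Blend the candidate and the previous state using the update gate.
    return (1.0 - z) * n + z * h_prev

# Hypothetical parameters: all zero except the update-gate bias.
zeros = {k: 0.0 for k in ("w_z", "u_z", "b_z", "w_r", "u_r", "b_r", "w_n", "u_n", "b_n")}
keep = dict(zeros, b_z=50.0)      # update gate close to 1
replace = dict(zeros, b_z=-50.0)  # update gate close to 0

h_keep = gru_step(1.0, 0.5, keep)        # stays close to the old state, 0.5
h_replace = gru_step(1.0, 0.5, replace)  # moves to the candidate, tanh(0) = 0
```

Pushing the update-gate bias to either extreme shows the two behaviours in isolation: the state is either carried forward almost unchanged or overwritten by the candidate.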
intermediate
Why might GRUs be preferred over LSTMs for some text tasks?
GRUs are simpler with fewer parameters, making them faster to train and sometimes better for smaller datasets while still capturing sequence information well.
beginner
In a text classification task using a GRU, what is the typical output after the GRU layer?
The typical output is a fixed-size vector summarizing the text, which is then passed to a classifier like a dense layer to predict categories.
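The following sketch shows one common way to wire this up in PyTorch: the GRU's final hidden state serves as the fixed-size summary vector and a linear layer maps it to class scores. The class name `GRUClassifier` and all sizes (vocabulary of 1000, embedding size 100, hidden size 50, two classes) are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

class GRUClassifier(nn.Module):
    """Hypothetical text classifier: embedding -> GRU -> dense layer."""

    def __init__(self, vocab_size=1000, embed_dim=100, hidden_size=50, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)  # (batch, seq_len, embed_dim)
        _, h_n = self.gru(embedded)           # h_n: (num_layers, batch, hidden_size)
        summary = h_n[-1]                     # (batch, hidden_size), fixed size
        return self.fc(summary)               # (batch, num_classes)

model = GRUClassifier()
token_ids = torch.randint(0, 1000, (4, 12))  # 4 texts, 12 token ids each
logits = model(token_ids)
print(logits.shape)  # torch.Size([4, 2])
```

The summary vector has the same size regardless of how many tokens each text contains, which is what lets a plain dense layer sit on top of it.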
What is the main purpose of the update gate in a GRU?
A. To generate the final output directly
B. To decide how much past information to keep
C. To reset the entire memory
D. To increase the learning rate
The update gate controls how much of the past information is kept for the current step.
Which of these is a benefit of using GRUs over traditional RNNs for text?
A. They can remember longer sequences better
B. They require no training
C. They always produce shorter outputs
D. They do not use gates
GRUs use gates to better remember important information over longer sequences compared to basic RNNs.
In text processing, what kind of data does a GRU layer take as input?
A. Sequences of word or character vectors
B. Single numbers only
C. Images
D. Audio files
GRUs process sequences, so they take sequences of vectors representing words or characters as input.
What does the reset gate in a GRU do?
A. Removes all past information
B. Decides the learning rate
C. Outputs the final prediction
D. Controls how to combine new input with past memory
The reset gate decides how much past memory to forget when combining with new input.
Which task is a GRU commonly used for in NLP?
A. Image recognition
B. Sorting numbers
C. Text classification
D. Database queries
GRUs are often used for tasks like text classification where understanding sequences is important.
Explain how a GRU processes a sequence of words in a text.
Think about how the GRU decides what to remember and what to forget at each word.
Describe the advantages of using a GRU over a simple RNN for text tasks.
Focus on how GRUs improve remembering important information in text.
Practice
1. What is the main advantage of using a GRU (Gated Recurrent Unit) in text processing tasks?
easy
A. It helps the model remember important information over time while ignoring less important details.
B. It increases the size of the input text automatically.
C. It converts text into images for better analysis.
D. It removes all punctuation from the text before processing.
Solution
Step 1: Understand GRU's role in memory
GRU units are designed to keep important information from previous steps and forget irrelevant data, helping with sequence tasks like text.
Step 2: Compare options to GRU function
Only option A correctly describes this memory feature; the other options describe unrelated or incorrect functions.
Final Answer:
It helps the model remember important information over time while ignoring less important details. -> Option A
Quick Check:
GRU memory feature = A [OK]
Hint: GRU remembers key info, forgets noise in sequences [OK]
Common Mistakes:
Thinking GRU changes input size
Confusing GRU with data preprocessing
Assuming GRU outputs images
2. Which of the following is the correct way to define a GRU layer in Python using PyTorch for text input with embedding size 100 and hidden size 50?
easy
A. nn.GRU(hidden_size=100, input_size=50)
B. nn.GRU(50, 100)
C. nn.GRU(input_size=100, hidden_size=50)
D. nn.GRU(100)
Solution
Step 1: Recall PyTorch GRU parameters
PyTorch GRU expects input_size first (embedding size), then hidden_size (number of features in hidden state).
Step 2: Match parameters to given sizes
Embedding size is 100, hidden size is 50, so nn.GRU(input_size=100, hidden_size=50) is correct.
Final Answer:
nn.GRU(input_size=100, hidden_size=50) -> Option C
Quick Check:
input_size=100, hidden_size=50 = C [OK]
Hint: Input size first, hidden size second in nn.GRU() [OK]
Common Mistakes:
Swapping input_size and hidden_size
Using positional args incorrectly
Omitting required parameters
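As a quick check of the definition above, the sketch below builds the layer with keyword arguments and passes a random batch through it. The batch size of 32, the sequence length of 10, and the use of `batch_first=True` are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Embedding size 100 goes to input_size, hidden size 50 to hidden_size.
gru = nn.GRU(input_size=100, hidden_size=50, batch_first=True)

# Assumed batch: 32 sequences, 10 steps each, 100 features per step.
x = torch.randn(32, 10, 100)
output, h_n = gru(x)

print(output.shape)  # torch.Size([32, 10, 50])
print(h_n.shape)     # torch.Size([1, 32, 50])
```

`output` holds the hidden state at every time step, while `h_n` holds only the final hidden state for each layer. Writing the sizes as keyword arguments avoids the swapped-argument mistake listed above.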
3. Given the following PyTorch code snippet, what will be the shape of the output tensor after passing input through the GRU?
A. Input size 100 does not match GRU input_size 50
B. Batch size 32 is too large for GRU
C. Sequence length 10 is invalid for GRU
D. GRU requires input to be 2D tensor, not 3D
Solution
Step 1: Check GRU input_size vs input tensor last dimension
GRU expects input_size=50, but input tensor last dimension is 100, causing mismatch.
Step 2: Understand tensor shape requirements
GRU input shape is (batch, seq_len, input_size) when batch_first=True, or (seq_len, batch, input_size) by default. In either layout, the last dimension must match the GRU's input_size parameter.
Final Answer:
Input size 100 does not match GRU input_size 50 -> Option A
Quick Check:
Input size mismatch = A [OK]
Hint: Match input tensor last dim to GRU input_size [OK]
Common Mistakes:
Blaming batch size for error
Thinking sequence length is invalid
Assuming GRU only accepts 2D input
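The mismatch described in this solution can be reproduced with a short sketch. The hidden size of 20 and `batch_first=True` are assumptions, since they are not given in the question. The batch size of 32, sequence length of 10, and feature sizes of 100 and 50 come from the options and solution above.

```python
import torch
import torch.nn as nn

# GRU declared for 50 input features (hidden size 20 is a hypothetical choice).
gru = nn.GRU(input_size=50, hidden_size=20, batch_first=True)

# Input with 100 features per step: (batch=32, seq_len=10, features=100).
x = torch.randn(32, 10, 100)

try:
    gru(x)
    error_message = None
except RuntimeError as err:
    # PyTorch rejects the input because its last dimension is not 50.
    error_message = str(err)

# One possible fix: declare the GRU with the input's actual feature size.
fixed_gru = nn.GRU(input_size=100, hidden_size=20, batch_first=True)
output, _ = fixed_gru(x)  # output: (32, 10, 20)
```

The batch size and sequence length play no part in the error; only the last dimension of the input is checked against `input_size`.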
5. You want to build a GRU-based model to classify movie reviews as positive or negative. Your dataset has variable-length reviews. Which approach best handles variable-length sequences with a GRU in PyTorch?
hard
A. Convert text to images and use CNN instead of GRU.
B. Truncate all sequences to length 1 and feed to GRU.
C. Feed raw sequences directly without padding or packing.
D. Pad all sequences to the same length and use pack_padded_sequence before GRU.
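The padding-and-packing approach named in option D can be sketched as follows. The three review lengths, the embedding size of 100, the hidden size of 50, and the random tensors standing in for embedded reviews are all assumptions for illustration. `pad_sequence` and `pack_padded_sequence` are the PyTorch utilities from `torch.nn.utils.rnn`.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

# Three hypothetical reviews, already embedded, with 7, 4, and 2 tokens.
reviews = [torch.randn(n, 100) for n in (7, 4, 2)]
lengths = torch.tensor([r.size(0) for r in reviews])

# Pad to the longest review so the batch is one rectangular tensor.
padded = pad_sequence(reviews, batch_first=True)  # (3, 7, 100)

# Pack so the GRU skips the padded positions.
packed = pack_padded_sequence(
    padded, lengths, batch_first=True, enforce_sorted=False
)

gru = nn.GRU(input_size=100, hidden_size=50, batch_first=True)
_, h_n = gru(packed)  # h_n: (1, 3, 50), last real step of each review

classifier = nn.Linear(50, 2)  # positive vs. negative
logits = classifier(h_n[-1])   # (3, 2)
```

With packing, each review's final hidden state comes from its last real token rather than from trailing padding. Setting `enforce_sorted=False` lets the batch stay in its original order.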