Recall & Review
beginner
What does the nn.LSTM layer in PyTorch do?
The nn.LSTM layer processes sequences of data by remembering information over time. It helps models learn patterns in sequences like sentences or time series.
beginner
What are the main inputs and outputs of an nn.LSTM layer?
Input: a sequence tensor with shape (sequence_length, batch_size, input_size) by default. Output: a tensor holding the hidden state at every time step, plus a tuple (h_n, c_n) with the final hidden and cell states.
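The shapes described in this answer can be checked with a short sketch. The sizes below (5 steps, batch of 3, 10 input features, 20 hidden features) are illustrative assumptions, not values from the card.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions chosen for this sketch)
seq_len, batch_size, input_size, hidden_size = 5, 3, 10, 20

lstm = nn.LSTM(input_size, hidden_size)
x = torch.randn(seq_len, batch_size, input_size)  # default layout: sequence first

# The layer returns the per-step outputs and a tuple of final states
output, (h_n, c_n) = lstm(x)

print(output.shape)  # torch.Size([5, 3, 20]) -> one hidden vector per time step
print(h_n.shape)     # torch.Size([1, 3, 20]) -> final hidden state, one layer
print(c_n.shape)     # torch.Size([1, 3, 20]) -> final cell state, one layer
```

The leading 1 in h_n and c_n is the number of layers (times the number of directions), which is 1 for a default single-layer LSTM.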
intermediate
Why does nn.LSTM have both hidden state and cell state?
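The hidden state is what the layer emits at each time step and acts as short-term memory. The cell state is an internal memory that the gates update a little at each step, so it can carry information across many steps. Together they help the LSTM remember important information over long sequences.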
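A small sketch of how the two states relate in practice, assuming a single-layer, single-direction LSTM with made-up sizes. It checks that the final hidden state is simply the last time step of the output, while the cell state is returned separately and never appears in the output.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Sizes are illustrative assumptions
lstm = nn.LSTM(input_size=4, hidden_size=6)
x = torch.randn(7, 2, 4)  # (sequence_length, batch_size, input_size)

with torch.no_grad():
    output, (h_n, c_n) = lstm(x)

# The final hidden state equals the last time step of the output
same_as_last_step = torch.allclose(output[-1], h_n[-1])
print(same_as_last_step)  # True

# The hidden state is a gated view of the cell state: h = o * tanh(c),
# with the output gate o between 0 and 1, so |h| never exceeds |tanh(c)|
bounded = bool((h_n.abs() <= c_n.tanh().abs() + 1e-6).all())
print(bounded)  # True
```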
beginner
How do you initialize an nn.LSTM layer for input size 10 and hidden size 20?
Use nn.LSTM(input_size=10, hidden_size=20). This sets the input feature size to 10 and the hidden layer size to 20.
intermediate
What does setting batch_first=True do in nn.LSTM?
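It changes the input shape to (batch_size, sequence_length, input_size) and the output shape to (batch_size, sequence_length, hidden_size), which is convenient when your data already has the batch dimension first. The final hidden and cell states keep their default layout.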
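A minimal sketch of the batch_first=True layout, with illustrative sizes that are assumptions for this example. Note that only the input and output tensors change layout; the returned states do not.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions)
batch_size, seq_len, input_size, hidden_size = 3, 5, 10, 20

lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
x = torch.randn(batch_size, seq_len, input_size)  # batch dimension comes first

output, (h_n, c_n) = lstm(x)

print(output.shape)  # torch.Size([3, 5, 20]) -> batch first, like the input
print(h_n.shape)     # torch.Size([1, 3, 20]) -> still (num_layers, batch, hidden)
```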
What shape does nn.LSTM expect for its input by default?
A. (input_size, sequence_length, batch_size)
B. (batch_size, input_size, sequence_length)
C. (batch_size, sequence_length, input_size)
D. (sequence_length, batch_size, input_size)
By default, nn.LSTM expects input shaped as (sequence_length, batch_size, input_size).
What are the two states returned by nn.LSTM besides the output?
A. hidden state and cell state
B. input state and output state
C. weight state and bias state
D. activation state and dropout state
nn.LSTM returns hidden state and cell state to keep track of short-term and long-term memory.
What does the hidden_size parameter control in nn.LSTM?
A. The batch size
B. The number of features in the hidden state
C. The length of the input sequence
D. The number of layers
hidden_size sets how many features the hidden state will have.
If batch_first=True, what is the input shape for nn.LSTM?
A. (batch_size, input_size, sequence_length)
B. (sequence_length, batch_size, input_size)
C. (batch_size, sequence_length, input_size)
D. (input_size, batch_size, sequence_length)
Setting batch_first=True changes input shape to (batch_size, sequence_length, input_size).
Why is nn.LSTM better than a simple RNN for long sequences?
A. Because it can remember information longer using cell state
B. Because it uses convolution layers
C. Because it has fewer parameters
D. Because it does not use activation functions
nn.LSTM uses a cell state to keep long-term memory, helping with long sequences.
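To see why the "fewer parameters" option is wrong, the sketch below compares nn.RNN and nn.LSTM with the same sizes. The sizes and the count_params helper are assumptions introduced for illustration. An LSTM has four gate-like weight blocks where a simple RNN has one, so it ends up with four times as many parameters, and it is the only one of the two that returns a cell state.

```python
import torch
import torch.nn as nn


def count_params(module: nn.Module) -> int:
    """Hypothetical helper: total number of learnable values in a module."""
    return sum(p.numel() for p in module.parameters())


# Same illustrative sizes for both layers (assumptions)
rnn = nn.RNN(input_size=10, hidden_size=20)
lstm = nn.LSTM(input_size=10, hidden_size=20)

rnn_params = count_params(rnn)
lstm_params = count_params(lstm)
print(rnn_params)   # 640
print(lstm_params)  # 2560, four times the simple RNN

x = torch.randn(5, 3, 10)
_, rnn_state = rnn(x)     # a single tensor: the hidden state only
_, lstm_state = lstm(x)   # a tuple: (hidden state, cell state)
print(isinstance(rnn_state, torch.Tensor), isinstance(lstm_state, tuple))  # True True
```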
Explain how nn.LSTM processes a sequence of data step-by-step.
Think about how information flows through time steps and how memory is kept.
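One way to picture the step-by-step flow is to feed the sequence one time step at a time and carry the (hidden, cell) state forward by hand. This is only a sketch with assumed sizes; in normal use you pass the whole sequence at once and the layer does the looping internally.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions)
lstm = nn.LSTM(input_size=4, hidden_size=6)
x = torch.randn(5, 2, 4)  # (sequence_length, batch_size, input_size)

with torch.no_grad():
    # Reference: process the whole sequence in one call
    full_output, _ = lstm(x)

    # Manual loop: one time step per call, passing the state along
    state = None  # None means "start from zero hidden and cell states"
    step_outputs = []
    for t in range(x.size(0)):
        step_input = x[t:t + 1]                  # keep the sequence dimension (length 1)
        step_out, state = lstm(step_input, state)
        step_outputs.append(step_out)
    stepped_output = torch.cat(step_outputs, dim=0)

matches = torch.allclose(full_output, stepped_output, atol=1e-5)
print(matches)  # True
```

Each call reads the current input and the previous state, updates the cell state, and emits a new hidden state, which is exactly what is carried into the next step.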
Describe the difference between hidden state and cell state in nn.LSTM and why both are important.
Consider how remembering recent vs. older information helps understanding sequences.
Practice
1. What is the primary purpose of the nn.LSTM layer in PyTorch?
easy
A. To process and remember information from sequences over time
B. To perform image classification using convolution
C. To reduce the dimensionality of data using PCA
D. To generate random numbers for initialization
Solution
Step 1: Understand the role of LSTM
LSTM stands for Long Short-Term Memory, a type of recurrent neural network layer designed to handle sequence data and remember information over time.
Step 2: Match purpose with options
Among the options, only processing and remembering sequence information matches the LSTM's purpose.
Final Answer:
To process and remember information from sequences over time -> Option A
Quick Check:
LSTM purpose = sequence memory [OK]
Hint: LSTM = sequence memory layer, not image or random [OK]
Common Mistakes:
Confusing LSTM with convolutional layers
Thinking LSTM reduces data dimension like PCA
Assuming LSTM generates random numbers
2. Which of the following is the correct way to create an LSTM layer in PyTorch with input size 10 and hidden size 20?
easy
A. nn.LSTM(input=10, hidden=20)
B. nn.LSTM(20, 10)
C. nn.LSTM(10, 20)
D. nn.LSTM(hidden_size=10, input_size=20)
Solution
Step 1: Recall nn.LSTM constructor parameters
The first argument is input_size (features per input), the second is hidden_size (features in hidden state).
Step 2: Match correct syntax
nn.LSTM(10, 20) passes 10 as the first argument and 20 as the second, which correctly sets input_size=10 and hidden_size=20.
Final Answer:
nn.LSTM(10, 20) -> Option C
Quick Check:
Constructor order = input_size, hidden_size [OK]
Hint: First arg input size, second hidden size in nn.LSTM() [OK]
Common Mistakes:
Swapping input_size and hidden_size
Using wrong keyword arguments
Confusing parameter names
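The options in question 2 can be compared directly. The sketch below builds the layer by position and by keyword, shows what happens when the two sizes are swapped, and shows that made-up keyword names are rejected. The variable names are assumptions for illustration.

```python
import torch.nn as nn

by_position = nn.LSTM(10, 20)                       # input_size first, hidden_size second
by_keyword = nn.LSTM(input_size=10, hidden_size=20)  # same layer, spelled out
swapped = nn.LSTM(20, 10)                           # runs, but the sizes are reversed

print(by_position.input_size, by_position.hidden_size)  # 10 20
print(by_keyword.input_size, by_keyword.hidden_size)    # 10 20
print(swapped.input_size, swapped.hidden_size)          # 20 10

# Keyword names other than input_size / hidden_size are not accepted
try:
    nn.LSTM(input=10, hidden=20)
    wrong_keywords_rejected = False
except TypeError:
    wrong_keywords_rejected = True
print(wrong_keywords_rejected)  # True
```

Swapping the sizes does not fail at construction time, so that mistake usually surfaces later as a shape error when data is passed in.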
3. Given the code below, what is the shape of output after running the LSTM?
4. What is wrong with this code snippet that tries to create an LSTM layer?
import torch.nn as nn
lstm = nn.LSTM(10)
medium
A. The input size must be a tuple, not an integer
B. It misses the hidden_size argument, causing an error
C. LSTM requires a batch size argument at creation
D. The code is correct and runs without error
Solution
Step 1: Check nn.LSTM constructor requirements
nn.LSTM requires two arguments, input_size and hidden_size, passed either by position or by keyword.
Step 2: Identify missing argument
The code only provides input_size=10, missing hidden_size, so it will raise a TypeError.
Final Answer:
It misses the hidden_size argument, causing an error -> Option B
Quick Check:
nn.LSTM needs input_size and hidden_size [OK]
Hint: nn.LSTM needs two sizes: input and hidden [OK]
Common Mistakes:
Thinking batch size is needed at layer creation
Assuming input_size can be a tuple
Believing code runs without error
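The error from question 4 can be reproduced and fixed as follows. The hidden size of 20 in the fix is an assumption, since the snippet in the question does not say what size was intended.

```python
import torch.nn as nn

# Reproduce the problem: hidden_size is missing
try:
    nn.LSTM(10)
    raised_type_error = False
except TypeError as err:
    raised_type_error = True
    print(type(err).__name__)  # TypeError

# Fix: supply both sizes (20 is an illustrative choice)
lstm = nn.LSTM(10, 20)
print(lstm.hidden_size)  # 20
```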
5. You want to build a model that processes sequences of length 6 with 8 features each. You want the LSTM to output a sequence with 12 features per time step. Which of the following LSTM layer initializations is correct to achieve this?
hard
A. nn.LSTM(input_size=12, hidden_size=8)
B. nn.LSTM(input_size=8, hidden_size=6)
C. nn.LSTM(input_size=6, hidden_size=8)
D. nn.LSTM(input_size=8, hidden_size=12)
Solution
Step 1: Identify input_size and hidden_size meanings
input_size is the number of features per time step in the input sequence. hidden_size is the number of features in the output per time step (for a single-direction LSTM with no projection).
Step 2: Match given sequence and desired output
Input sequences have 8 features, so input_size=8. Desired output features per time step is 12, so hidden_size=12.
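A quick check of this reasoning, using the numbers from the question (sequence length 6, 8 input features, 12 output features). The batch size of 4 is an assumption, since the question does not give one.

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=12)

# (sequence_length=6, batch_size=4, input_size=8); batch size is assumed
x = torch.randn(6, 4, 8)
output, _ = lstm(x)

print(output.shape)  # torch.Size([6, 4, 12]) -> 12 features at each of the 6 steps
```

The sequence length (6) never appears in the constructor; it is only a property of the data passed in.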