Recall & Review

beginner

What is a hidden state in recurrent neural networks (RNNs)?

A hidden state is a tensor that acts as the network's memory: it summarizes the inputs seen so far in a sequence. The RNN updates it at every time step from the current input and the previous hidden state, so earlier data can influence later predictions.
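To make the idea concrete, here is a minimal sketch that steps through a sequence one time step at a time with `nn.RNNCell`. The sizes and the random input are illustrative assumptions, not values from this page.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions for this sketch).
input_size, hidden_size, seq_len, batch_size = 4, 8, 5, 2

cell = nn.RNNCell(input_size, hidden_size)
inputs = torch.randn(seq_len, batch_size, input_size)

# Start with an empty memory.
h = torch.zeros(batch_size, hidden_size)

for x_t in inputs:
    # The new hidden state depends on the current input
    # and on the previous hidden state.
    h = cell(x_t, h)

final_shape = tuple(h.shape)  # one hidden vector per batch element
```

After the loop, `h` holds one vector of size `hidden_size` per sequence in the batch, built up from all five time steps.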

intermediate

Why do we need to manage hidden states carefully during training in PyTorch?

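Because hidden states carry information across time steps, a hidden state reused from one batch to the next stays attached to the computation graph of every earlier batch. Backpropagation then tries to run through the entire history, which increases memory use, slows training, and in PyTorch typically raises a RuntimeError once the earlier graph has been freed.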
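One common way to handle this is truncated backpropagation through time: carry the hidden state's values across chunks of a long sequence, but cut the graph at each chunk boundary. The sketch below assumes a toy regression task; the sizes, the random data, and the `head` layer are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes and random data, for illustration only.
input_size, hidden_size, batch_size, chunk_len = 3, 6, 2, 4

rnn = nn.RNN(input_size, hidden_size, batch_first=True)
head = nn.Linear(hidden_size, 1)
params = list(rnn.parameters()) + list(head.parameters())
optimizer = torch.optim.SGD(params, lr=0.01)
loss_fn = nn.MSELoss()

# One long sequence per batch element, split into consecutive chunks.
long_x = torch.randn(batch_size, 3 * chunk_len, input_size)
long_y = torch.randn(batch_size, 3 * chunk_len, 1)

hidden = None  # nn.RNN starts from zeros when no hidden state is given
chunks_done = 0
for x, y in zip(long_x.split(chunk_len, dim=1), long_y.split(chunk_len, dim=1)):
    if hidden is not None:
        # Keep the values, drop the link to the previous chunk's graph.
        hidden = hidden.detach()

    optimizer.zero_grad()
    out, hidden = rnn(x, hidden)
    loss = loss_fn(head(out), y)
    loss.backward()  # gradients stop at the start of this chunk
    optimizer.step()
    chunks_done += 1
```

The hidden state still informs the next chunk's forward pass; only the gradient path into earlier chunks is removed.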

intermediate

What does the method detach() do when applied to a hidden state tensor in PyTorch?

detach() returns a new tensor that shares the same values as the hidden state but is cut off from the computation graph. Gradients cannot flow back past that point, which prevents backpropagation through the entire sequence history.
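The following small sketch shows what detaching changes and what it leaves alone. The tensors `w` and `v` are stand-ins invented for this example, not part of any RNN.

```python
import torch

w = torch.ones(3, requires_grad=True)
h = w * 2.0               # h is part of the graph: it has a grad_fn
h_detached = h.detach()   # same values, no graph history

print(h.requires_grad, h.grad_fn is not None)                 # True True
print(h_detached.requires_grad, h_detached.grad_fn is None)   # False True
print(h_detached.data_ptr() == h.data_ptr())                  # True: shared storage, no copy

# A loss built on the detached tensor cannot send gradients back to w.
v = torch.ones(3, requires_grad=True)
loss = (h_detached * v).sum()
loss.backward()
print(w.grad)  # None
print(v.grad)  # tensor([2., 2., 2.])
```

Note that detach() does not reset, delete, or move the tensor; it only removes the gradient history.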

beginner

How do you initialize a hidden state for an RNN in PyTorch?

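Create a tensor of zeros with shape (num_layers, batch_size, hidden_size) and pass it as the initial hidden state; for a bidirectional RNN the first dimension is 2 * num_layers. torch.zeros already returns a tensor with requires_grad=False, so you do not need to set it. If you pass no hidden state at all, nn.RNN starts from zeros by default.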
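As a minimal sketch of the initialization described above, with illustrative sizes that are assumptions of this example:

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions for this sketch).
num_layers, batch_size, seq_len, input_size, hidden_size = 2, 3, 5, 4, 8

rnn = nn.RNN(input_size, hidden_size, num_layers=num_layers, batch_first=True)

# With batch_first=True the input is (batch, seq, feature).
x = torch.randn(batch_size, seq_len, input_size)

# The hidden state stays (layers, batch, hidden) regardless of batch_first.
# Matching device and dtype to the input avoids mismatch errors.
h0 = torch.zeros(num_layers, batch_size, hidden_size,
                 device=x.device, dtype=x.dtype)

out, hn = rnn(x, h0)
out_shape = tuple(out.shape)  # (3, 5, 8): one output per time step
hn_shape = tuple(hn.shape)    # (2, 3, 8): final hidden state per layer
```

The batch dimension of `h0` has to match the batch dimension of `x`; a mismatch there is a frequent source of shape errors.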

intermediate

What is the difference between hidden state and cell state in LSTM networks?

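The hidden state carries the output of each time step, while the cell state carries long-term memory that the LSTM's gates update by adding and forgetting information. In PyTorch, nn.LSTM takes and returns the two together as a tuple (hidden state, cell state).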
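A short sketch of how the two states appear in `nn.LSTM`, assuming a single-layer, unidirectional model with illustrative sizes:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions for this sketch).
num_layers, batch_size, seq_len, input_size, hidden_size = 1, 2, 5, 4, 8

lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers, batch_first=True)
x = torch.randn(batch_size, seq_len, input_size)

# Both states have the same shape: (layers, batch, hidden).
h0 = torch.zeros(num_layers, batch_size, hidden_size)
c0 = torch.zeros(num_layers, batch_size, hidden_size)

out, (hn, cn) = lstm(x, (h0, c0))

# For one unidirectional layer, the last output step is the final hidden state.
same = torch.allclose(out[:, -1, :], hn[-1])

# When carrying state into the next batch, detach both tensors.
state = (hn.detach(), cn.detach())
```

The cell state never appears in `out`; it is only available through the returned tuple.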

What is the main purpose of the hidden state in an RNN?

A. To initialize weights

B. To store information from previous inputs

C. To compute loss

D. To normalize inputs

In PyTorch, what does calling hidden_state.detach() do?

A. Prevents gradients from flowing back beyond this point

B. Deletes the hidden state

C. Resets the hidden state to zero

D. Copies the hidden state to CPU

How should you initialize the hidden state for an RNN in PyTorch?

A. Ones with shape (input_size, hidden_size)

B. Random values with shape (batch, input_size)

C. A zero tensor with shape (layers, batch, hidden_size)

D. A scalar zero

What happens if you do not detach the hidden state during training?

A. The hidden state will reset automatically

B. The model will not train

C. The model will ignore the hidden state

D. Backpropagation will go through all previous time steps, increasing memory use

In LSTM, what is the role of the cell state compared to the hidden state?

A. Cell state carries long-term memory; hidden state carries output

B. Cell state is the input; hidden state is the output

C. Cell state is always zero; hidden state changes

D. They are the same

Explain how hidden states are managed during training of an RNN in PyTorch and why detaching is important.

Describe the difference between hidden state and cell state in LSTM networks and their roles.

Practice

(1/5)

1. What is the main purpose of the hidden state in a PyTorch RNN model?

easy

A. To store information from previous time steps in a sequence

B. To initialize the model weights randomly

C. To store the final output of the model

D. To reset the model after each batch

Solution

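Step 1: Understand the role of hidden state in sequence models

The hidden state carries information from earlier time steps forward, so each new output can depend on what the model has already seen.

Step 2: Differentiate hidden state from other components

Initializing weights and resetting the model are separate operations, and the hidden state is updated at every time step rather than holding only the final output.

Final Answer:

A. To store information from previous time steps in a sequence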

Quick Check:

Solution

Step 1: Recall RNN hidden state shape requirements

Step 2: Match options to correct shape

Final Answer:

Quick Check:

Solution

Step 1: Understand RNN output shape with batch_first=True

Step 2: Match output shape to options

Final Answer:

Quick Check:

Solution

Step 1: Check input and hidden state shapes

Step 2: Identify mismatch in batch size

Final Answer:

Quick Check:

Solution

Step 1: Understand hidden state persistence across batches

Step 2: Avoid backpropagation through entire history

Final Answer:

Quick Check: