
nn.LSTM layer in PyTorch

Introduction
An LSTM (Long Short-Term Memory) layer helps a model remember important information from sequences, like sentences or time series, so it can make better predictions. Common use cases:
Predicting the next word in a sentence.
Analyzing time-based data like stock prices or weather.
Processing audio signals for speech recognition.
Handling sequences of events in game or robot control.
Syntax
PyTorch
torch.nn.LSTM(input_size, hidden_size, num_layers=1, batch_first=False, dropout=0, bidirectional=False)
input_size is the number of features in each input step.
hidden_size is how many features the LSTM outputs at each step.
num_layers stacks multiple LSTM layers on top of each other (default 1).
batch_first=True makes the input and output tensors use the shape (batch, seq, feature) instead of the default (seq, batch, feature).
dropout adds dropout between stacked layers (it only has an effect when num_layers > 1).
bidirectional=True processes the sequence in both directions, doubling the output feature size to 2 * hidden_size.
Examples
Creates a single-layer LSTM that takes inputs with 10 features and outputs 20 features at each time step.
PyTorch
lstm = torch.nn.LSTM(input_size=10, hidden_size=20)
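With the default batch_first=False, inputs use the shape (seq_len, batch, input_size). A minimal shape check (the sequence length of 7 and batch of 3 are arbitrary choices for illustration) confirms the 20 output features per step:

```python
import torch

# Single-layer LSTM: 10 input features -> 20 output features per step
lstm = torch.nn.LSTM(input_size=10, hidden_size=20)

# Default layout is (seq_len, batch, input_size): 7 steps, batch of 3
x = torch.randn(7, 3, 10)
output, (hn, cn) = lstm(x)

print(output.shape)  # torch.Size([7, 3, 20]) - one 20-feature vector per step
print(hn.shape)      # torch.Size([1, 3, 20]) - final hidden state, one per layer
```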
Creates a 2-layer LSTM where input and output tensors have batch size first.
PyTorch
lstm = torch.nn.LSTM(input_size=5, hidden_size=15, num_layers=2, batch_first=True)
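With batch_first=True the batch dimension comes first in the input and output, but hn and cn always keep the shape (num_layers, batch, hidden_size). A quick sketch (batch of 4 and 6 time steps chosen for illustration):

```python
import torch

# 2 stacked LSTM layers, batch-first tensor layout
lstm = torch.nn.LSTM(input_size=5, hidden_size=15, num_layers=2, batch_first=True)

# batch_first=True: (batch, seq_len, input_size)
x = torch.randn(4, 6, 5)
output, (hn, cn) = lstm(x)

print(output.shape)  # torch.Size([4, 6, 15]) - output keeps batch first
print(hn.shape)      # torch.Size([2, 4, 15]) - one final state per layer, not batch first
```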
Creates a bidirectional LSTM that reads sequences forwards and backwards.
PyTorch
lstm = torch.nn.LSTM(input_size=8, hidden_size=16, bidirectional=True)
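A bidirectional LSTM concatenates the forward and backward passes, so each step's output has 2 * hidden_size features, and hn holds one state per direction. A minimal sketch (sequence length 5, batch of 2 are arbitrary):

```python
import torch

# Bidirectional LSTM: forward + backward outputs are concatenated
lstm = torch.nn.LSTM(input_size=8, hidden_size=16, bidirectional=True)

x = torch.randn(5, 2, 8)  # (seq_len, batch, input_size)
output, (hn, cn) = lstm(x)

print(output.shape)  # torch.Size([5, 2, 32]) - 2 * hidden_size = 32
print(hn.shape)      # torch.Size([2, 2, 16]) - num_layers * num_directions = 2
```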
Sample Model
This code creates a simple LSTM layer and passes a batch of two sequences through it. It prints the shapes of the output and hidden states to show how data flows.
PyTorch
import torch
import torch.nn as nn

# Create an LSTM layer
lstm = nn.LSTM(input_size=3, hidden_size=5, num_layers=1, batch_first=True)

# Example input: batch of 2 sequences, each with 4 time steps, each step has 3 features
input_seq = torch.randn(2, 4, 3)

# Forward pass through LSTM
output, (hn, cn) = lstm(input_seq)

print('Output shape:', output.shape)
print('Hidden state shape:', hn.shape)
print('Cell state shape:', cn.shape)
Output
Output shape: torch.Size([2, 4, 5])
Hidden state shape: torch.Size([1, 2, 5])
Cell state shape: torch.Size([1, 2, 5])
Important Notes
The output tensor contains the LSTM's output (the last layer's hidden state) for every time step in the sequence.
The hidden state (hn) and cell state (cn) hold the LSTM's memory after the final time step, with one state per layer.
Setting batch_first=True means your input shape should be (batch_size, sequence_length, features).
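When no initial states are given, they default to zeros. You can also pass the initial hidden and cell states explicitly, which is useful for carrying memory across chunks of a long sequence. A minimal sketch (the chunk tensors here are random placeholders):

```python
import torch

lstm = torch.nn.LSTM(input_size=3, hidden_size=5, batch_first=True)

chunk1 = torch.randn(2, 4, 3)  # (batch, seq_len, features)
chunk2 = torch.randn(2, 4, 3)

# First chunk: initial states default to zeros
out1, (hn, cn) = lstm(chunk1)

# Second chunk: reuse the final states so memory carries over
out2, (hn2, cn2) = lstm(chunk2, (hn, cn))

print(out2.shape)  # torch.Size([2, 4, 5])
```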
Summary
LSTM layers help models remember information from sequences.
You set input size and hidden size to control input features and output features.
Outputs include the sequence output and the final hidden and cell states.