The nn.GRU layer helps a model remember information from sequences, like words in a sentence, to make better predictions.
nn.GRU layer in PyTorch
Introduction
When you want to understand sentences or speech over time.
When predicting the next word in a sentence.
When analyzing time series data like stock prices.
When building chatbots that remember past messages.
When processing sequences of sensor data for predictions.
Syntax
PyTorch
torch.nn.GRU(input_size, hidden_size, num_layers=1, bias=True, batch_first=False, dropout=0.0, bidirectional=False)
input_size: Number of features in each input step.
hidden_size: Number of features in the hidden state (memory size).
num_layers: Number of stacked GRU layers (default 1).
bias: Whether the layer learns bias weights (default True).
batch_first: If True, input and output tensors are shaped (batch, seq, feature) instead of (seq, batch, feature).
dropout: Dropout probability applied between stacked layers (only has an effect when num_layers > 1).
bidirectional: If True, the sequence is processed in both directions and outputs are concatenated.
Examples
Creates a GRU layer that takes inputs with 10 features and outputs hidden states with 20 features.
PyTorch
gru = torch.nn.GRU(input_size=10, hidden_size=20)
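With the default batch_first=False, this layer expects input shaped (seq_len, batch, input_size). A minimal sketch (the sequence length and batch size here are arbitrary example values):

```python
import torch

# GRU with 10 input features and a 20-dimensional hidden state
gru = torch.nn.GRU(input_size=10, hidden_size=20)

# Default layout: (seq_len, batch, input_size)
x = torch.randn(7, 3, 10)   # 7 time steps, batch of 3 sequences
output, hn = gru(x)         # h0 defaults to zeros when omitted

print(output.shape)  # torch.Size([7, 3, 20]) - hidden state at every time step
print(hn.shape)      # torch.Size([1, 3, 20]) - final hidden state per layer
```

Note that the initial hidden state is optional: if you do not pass one, PyTorch uses zeros.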
Creates a 2-layer GRU where input and output tensors have batch size as the first dimension.
PyTorch
gru = torch.nn.GRU(input_size=5, hidden_size=15, num_layers=2, batch_first=True)
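A quick sketch of the shapes this 2-layer, batch-first GRU produces (the batch size and sequence length are arbitrary example values):

```python
import torch

# 2-layer GRU with batch-first tensors
gru = torch.nn.GRU(input_size=5, hidden_size=15, num_layers=2, batch_first=True)

x = torch.randn(8, 12, 5)   # (batch=8, seq_len=12, input_size=5)
output, hn = gru(x)

print(output.shape)  # torch.Size([8, 12, 15]) - batch first, top layer only
print(hn.shape)      # torch.Size([2, 8, 15])  - one final state per layer
```

One detail worth remembering: batch_first only changes the layout of the input and output tensors; the hidden state hn is always shaped (num_layers, batch, hidden_size).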
Sample Model
This code creates a simple GRU layer and passes a random input through it. It prints the shapes and values of the output and hidden state tensors.
PyTorch
import torch
import torch.nn as nn

# Create a GRU layer
input_size = 3
hidden_size = 5
num_layers = 1
batch_size = 2
seq_length = 4

gru = nn.GRU(input_size, hidden_size, num_layers, batch_first=True)

# Random input: batch_size sequences, each with seq_length steps, each step with input_size features
input_tensor = torch.randn(batch_size, seq_length, input_size)

# Initial hidden state (num_layers, batch_size, hidden_size)
h0 = torch.zeros(num_layers, batch_size, hidden_size)

# Forward pass through GRU
output, hn = gru(input_tensor, h0)

print("Output shape:", output.shape)
print("Hidden state shape:", hn.shape)
print("Output tensor:", output)
print("Hidden state tensor:", hn)
Output
Output shape: torch.Size([2, 4, 5])
Hidden state shape: torch.Size([1, 2, 5])
(The printed tensor values vary from run to run because the input is random; the shapes are always the same.)
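To turn the GRU's memory into an actual prediction, the final hidden state is commonly fed to a linear layer. This is an illustrative extension of the sample model above, not part of it; the Linear head and its 2-class output size are assumptions:

```python
import torch
import torch.nn as nn

gru = nn.GRU(input_size=3, hidden_size=5, batch_first=True)
head = nn.Linear(5, 2)      # illustrative head: map hidden state to 2 classes

x = torch.randn(2, 4, 3)    # (batch, seq_len, input_size)
output, hn = gru(x)

logits = head(hn[-1])       # hn[-1]: final hidden state of the last layer
print(logits.shape)  # torch.Size([2, 2]) - one score per class, per sequence
```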
Important Notes
The GRU uses update and reset gates to decide, at each step, how much past information to keep and how much new input to take in, which is what lets it handle sequence data.
Setting batch_first=True makes input and output shapes easier to work with (batch size first).
You can stack multiple GRU layers by increasing num_layers.
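The syntax section also lists a bidirectional option; as a short sketch, setting bidirectional=True doubles the output feature dimension because the forward and backward passes are concatenated (the sizes below are arbitrary example values):

```python
import torch

# Bidirectional GRU: the sequence is read both forward and backward
gru = torch.nn.GRU(input_size=4, hidden_size=6, batch_first=True, bidirectional=True)

x = torch.randn(2, 9, 4)    # (batch, seq_len, input_size)
output, hn = gru(x)

print(output.shape)  # torch.Size([2, 9, 12]) - hidden_size * 2 directions
print(hn.shape)      # torch.Size([2, 2, 6])  - (num_layers * 2, batch, hidden_size)
```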
Summary
The nn.GRU layer helps models learn from sequences by keeping memory of past inputs.
It takes input size and hidden size as main settings.
Use it when working with time or sequence data like text, speech, or sensor readings.