Activation functions help a neural network learn complex patterns by adding non-linearity. They decide if a neuron should be activated or not.
Here is how the three most common activation functions (ReLU, Sigmoid, Softmax) are applied in PyTorch:
```python
import torch
import torch.nn.functional as F

# ReLU activation
relu_output = F.relu(input_tensor)

# Sigmoid activation
sigmoid_output = torch.sigmoid(input_tensor)

# Softmax activation
softmax_output = F.softmax(input_tensor, dim=1)
```
ReLU sets negative values to zero and keeps positive values unchanged. Sigmoid squashes any input into the range (0, 1). Softmax is used for multi-class outputs: it turns a vector of scores into probabilities that sum to 1, and requires specifying the dimension along which to normalize.
```python
import torch
import torch.nn.functional as F

input_tensor = torch.tensor([-1.0, 0.0, 2.0])
relu_output = F.relu(input_tensor)
print(relu_output)
```
```python
import torch

input_tensor = torch.tensor([-1.0, 0.0, 2.0])
sigmoid_output = torch.sigmoid(input_tensor)
print(sigmoid_output)
```
```python
import torch
import torch.nn.functional as F

input_tensor = torch.tensor([[1.0, 2.0, 3.0]])
softmax_output = F.softmax(input_tensor, dim=1)
print(softmax_output)
```
The program below builds a simple one-layer neural network, applies ReLU, Sigmoid, and Softmax to the layer's output, and prints the results for a batch of two input samples.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 3)  # 4 inputs to 3 outputs

    def forward(self, x):
        x = self.fc1(x)
        relu_out = F.relu(x)
        sigmoid_out = torch.sigmoid(x)
        softmax_out = F.softmax(x, dim=1)
        return relu_out, sigmoid_out, softmax_out

# Create model
model = SimpleNet()

# Sample input: batch of 2 samples, each with 4 features
input_data = torch.tensor([[1.0, 2.0, 3.0, 4.0],
                           [4.0, 3.0, 2.0, 1.0]])

relu_out, sigmoid_out, softmax_out = model(input_data)

print("ReLU output:", relu_out)
print("Sigmoid output:", sigmoid_out)
print("Softmax output:", softmax_out)
```
ReLU is simple and fast, but a neuron that always outputs zero receives no gradient and can stop learning entirely (the "dying ReLU" problem).
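A minimal sketch of why such neurons stop learning (the input values here are illustrative): for negative inputs ReLU outputs zero and its gradient is zero, so no update flows back to those weights.

```python
import torch
import torch.nn.functional as F

# Inputs on both sides of zero, with gradient tracking enabled
x = torch.tensor([-2.0, -0.5, 1.0, 3.0], requires_grad=True)

# Backpropagate through ReLU
F.relu(x).sum().backward()

# Gradient is 0 for negative inputs, 1 for positive inputs
print(x.grad)  # tensor([0., 0., 1., 1.])
```

Variants such as Leaky ReLU keep a small slope for negative inputs precisely to avoid this zero-gradient region.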
Sigmoid is good for binary outputs, but learning slows when its inputs saturate and the output sits near 0 or 1, because the gradient there is close to zero.
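To see the saturation effect, here is a small sketch (input values chosen for illustration): the sigmoid gradient peaks at 0.25 when the input is 0 and shrinks rapidly for large inputs.

```python
import torch

# One input near zero, two pushing into the saturated region
x = torch.tensor([0.0, 2.0, 10.0], requires_grad=True)

torch.sigmoid(x).sum().backward()

# Gradient sigma(x) * (1 - sigma(x)) is 0.25 at x = 0
# and becomes vanishingly small once the sigmoid saturates
print(x.grad)  # approximately [0.25, 0.105, 4.5e-05]
```

This vanishing gradient is the main reason sigmoid is rarely used in deep hidden layers.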
Softmax is best for multi-class classification where outputs represent probabilities.
Activation functions add non-linearity so neural networks can learn complex patterns.
ReLU sets negative values to zero; Sigmoid squashes values between 0 and 1; Softmax turns scores into probabilities.
Choose activation based on your task: ReLU for hidden layers, Sigmoid for binary output, Softmax for multi-class output.
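Those guidelines can be put together in one sketch (the layer sizes here are illustrative): ReLU on the hidden layer, and Sigmoid on a single output unit for a binary prediction.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(4, 8)  # hidden layer
        self.out = nn.Linear(8, 1)     # single output unit

    def forward(self, x):
        x = F.relu(self.hidden(x))         # ReLU for the hidden layer
        return torch.sigmoid(self.out(x))  # Sigmoid for binary output

model = BinaryClassifier()
probs = model(torch.randn(2, 4))
print(probs)  # one value per sample, each between 0 and 1
```

For a multi-class version, the output layer would have one unit per class and Softmax (or, more commonly in training, raw logits fed to `nn.CrossEntropyLoss`) would replace the Sigmoid.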