PyTorch · ML · ~5 mins

Activation functions (ReLU, Sigmoid, Softmax) in PyTorch - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is the purpose of an activation function in a neural network?
An activation function adds non-linearity to the model, allowing it to learn complex patterns and make decisions beyond simple linear relationships.
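A minimal sketch of why non-linearity matters: two `nn.Linear` layers stacked with no activation in between collapse into a single linear map, while inserting a ReLU breaks that equivalence.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two stacked Linear layers with no activation in between
lin1 = nn.Linear(4, 8, bias=False)
lin2 = nn.Linear(8, 3, bias=False)

x = torch.randn(5, 4)

# The composition is still one linear map: x @ (W2 @ W1).T
stacked = lin2(lin1(x))
collapsed = x @ (lin2.weight @ lin1.weight).T
print(torch.allclose(stacked, collapsed, atol=1e-6))  # True: no extra expressive power

# Inserting a ReLU between the layers breaks the equivalence
with_act = lin2(torch.relu(lin1(x)))
print(torch.allclose(with_act, collapsed, atol=1e-6))  # False: the network is now non-linear
```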
beginner
Describe the ReLU activation function and its output behavior.
ReLU (Rectified Linear Unit) outputs the input directly if it is positive; otherwise, it outputs zero. It helps models learn faster and reduces the chance of vanishing gradients.
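A quick check of the ReLU behavior described above, using both the functional form (`torch.relu`) and the module form (`nn.ReLU`):

```python
import torch

x = torch.tensor([-2.0, -0.5, 0.0, 1.5, 3.0])

# Functional form: negatives become 0, positives pass through unchanged
print(torch.relu(x))

# Module form, as used inside nn.Sequential models
relu = torch.nn.ReLU()
print(relu(x))
```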
beginner
What is the range of the Sigmoid activation function and why is it useful?
Sigmoid outputs values between 0 and 1, making it useful for models that predict probabilities or binary outcomes.
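A short sketch showing the (0, 1) range in practice: `torch.sigmoid` squashes raw logits into probability-like values, with 0 mapping to exactly 0.5.

```python
import torch

logits = torch.tensor([-4.0, 0.0, 4.0])
probs = torch.sigmoid(logits)

# Each value is squashed into (0, 1); sigmoid(0) == 0.5
print(probs)  # roughly [0.018, 0.5, 0.982]
```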
intermediate
Explain the Softmax activation function and when it is typically used.
Softmax converts a vector of numbers into probabilities that sum to 1. It is commonly used in the output layer for multi-class classification problems.
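A minimal example of `torch.softmax` turning a vector of class scores into a probability distribution; note the required `dim` argument, which selects the axis that should sum to 1.

```python
import torch

# Raw scores (logits) for a 3-class problem
logits = torch.tensor([2.0, 1.0, 0.1])

# Softmax over the class dimension
probs = torch.softmax(logits, dim=0)
print(probs)        # largest logit gets the largest probability
print(probs.sum())  # the probabilities sum to 1
```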
intermediate
What problem does ReLU help to solve compared to Sigmoid or Tanh?
ReLU helps reduce the vanishing gradient problem by allowing gradients to flow when inputs are positive, unlike Sigmoid or Tanh which can squash gradients to near zero.
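The vanishing-gradient contrast can be seen directly with autograd: at a large input, Sigmoid's gradient is nearly zero (it saturates), while ReLU's gradient stays at 1 for any positive input.

```python
import torch

# Sigmoid saturates: its gradient at x = 6 is tiny (~0.0025)
x = torch.tensor([6.0], requires_grad=True)
torch.sigmoid(x).backward()
print(x.grad)

# ReLU does not saturate for positive inputs: gradient is exactly 1
y = torch.tensor([6.0], requires_grad=True)
torch.relu(y).backward()
print(y.grad)
```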
Which activation function outputs values strictly between 0 and 1?
A. Softmax
B. ReLU
C. Linear
D. Sigmoid
What does the ReLU function output when the input is negative?
A. The input value
B. Negative of the input
C. Zero
D. One
Softmax is mainly used in which part of a neural network?
A. Output layer for multi-class classification
B. Hidden layers
C. Input layer
D. Output layer for regression
Which activation function can cause the vanishing gradient problem?
A. Sigmoid
B. ReLU
C. Softmax
D. Leaky ReLU
What is a key benefit of using ReLU over Sigmoid?
A. Outputs probabilities
B. Non-linear, faster to compute, and reduces vanishing gradients
C. Always outputs positive values
D. Used for multi-class classification
Explain how ReLU, Sigmoid, and Softmax activation functions differ in their output and typical use cases.
Think about the output range and where in the network each is used.
Describe why activation functions are important in neural networks and what would happen without them.
Consider what linear vs. non-linear means for learning.