
Activation functions (ReLU, sigmoid, softmax) in TensorFlow - Cheat Sheet & Quick Revision

Recall & Review
beginner
What does the ReLU activation function do to input values?
ReLU (Rectified Linear Unit) outputs the input unchanged if it is positive and zero otherwise, that is, f(x) = max(0, x). It is cheap to compute and does not saturate for positive inputs, which tends to speed up training.
beginner
Describe the sigmoid activation function and its output range.
The sigmoid function squashes input values into a range between 0 and 1, making it useful for probabilities in binary classification tasks.
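As a minimal sketch of this squashing behaviour, the plain-Python snippet below evaluates 1 / (1 + e^(-x)) at a few points. The helper name `sigmoid` and the sample inputs are assumptions chosen for illustration; this is not TensorFlow's implementation, where the corresponding call would be `tf.nn.sigmoid`.

```python
import math

def sigmoid(x):
    """Logistic sigmoid, 1 / (1 + exp(-x)); illustrative helper only."""
    return 1.0 / (1.0 + math.exp(-x))

# Large negative inputs approach 0, large positive inputs approach 1,
# and an input of 0 maps to 0.5.
samples = [-6.0, -1.0, 0.0, 1.0, 6.0]
outputs = [sigmoid(x) for x in samples]

for x, y in zip(samples, outputs):
    print(f"sigmoid({x:+.1f}) = {y:.4f}")
```

Every output stays strictly between 0 and 1, which is why a single sigmoid unit can be read as the probability of the positive class in binary classification.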
intermediate
What is the purpose of the softmax activation function in neural networks?
Softmax converts a vector of raw scores into probabilities that sum to 1, often used in the output layer for multi-class classification.
intermediate
Why is ReLU preferred over sigmoid in hidden layers of deep networks?
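ReLU mitigates the vanishing gradient problem: its gradient is exactly 1 for positive inputs, so it does not saturate there, whereas the sigmoid's gradient is at most 0.25 and shrinks toward zero for large positive or negative inputs. This typically makes training deep networks faster and more effective.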
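The effect on gradients can be sketched numerically. The snippet below is a deliberate simplification: it ignores weights and assumes the same hypothetical pre-activation value (2.0) at each of 10 layers, then multiplies the activation derivatives as backpropagation would. The helper names are illustrative only.

```python
import math

def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x)), never larger than 0.25."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

# Hypothetical setup: identical pre-activation at every layer, weights ignored.
depth = 10
pre_activation = 2.0

# Backpropagation multiplies one derivative factor per layer.
sigmoid_chain = sigmoid_grad(pre_activation) ** depth
relu_chain = relu_grad(pre_activation) ** depth

print(f"sigmoid chain factor: {sigmoid_chain:.2e}")
print(f"ReLU chain factor:    {relu_chain:.2e}")
```

Under these assumptions the sigmoid factor collapses toward zero while the ReLU factor stays at 1. ReLU still yields a zero gradient for negative inputs, so this illustrates the positive-input case only.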
intermediate
How does softmax ensure the outputs can be interpreted as probabilities?
Softmax exponentiates each input and divides by the sum of all exponentiated inputs, ensuring all outputs are positive and sum to 1.
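The exponentiate-then-normalize recipe can be written out in a few lines of plain Python. This is a sketch for illustration, not TensorFlow's implementation; the helper name `softmax` and the sample scores are assumptions, and subtracting the maximum is a common numerical-stability step that does not change the result.

```python
import math

def softmax(scores):
    """Exponentiate each score, then divide by the sum of the exponentials."""
    # Shifting by the maximum leaves the output unchanged but avoids overflow.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])  # → [0.659, 0.242, 0.099]
print(sum(probs))
```

Because every exponential is positive and each is divided by the same total, the outputs are positive and sum to 1 (up to floating-point rounding).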
Which activation function outputs zero for all negative inputs?
A. Softmax
B. ReLU
C. Sigmoid
D. Tanh
What is the output range of the sigmoid activation function?
A. -infinity to infinity
B. -1 to 1
C. 0 to infinity
D. 0 to 1
Which activation function is best suited for multi-class classification output layers?
A. Softmax
B. ReLU
C. Sigmoid
D. Linear
Why might sigmoid activation slow down training in deep networks?
A. It causes vanishing gradients
B. It is not differentiable
C. It outputs negative values only
D. It outputs only zeros
How does softmax transform its input vector?
A. By applying ReLU element-wise
B. By normalizing inputs to sum to zero
C. By exponentiating inputs and normalizing to sum to one
D. By clipping inputs between 0 and 1
Explain the differences between ReLU, sigmoid, and softmax activation functions and when to use each.
Think about output ranges and typical use cases in neural networks.
Describe why ReLU helps avoid the vanishing gradient problem compared to sigmoid.
Consider how gradients behave during backpropagation.

      Practice

      1. Which activation function is best suited for hidden layers in a neural network to keep only positive signals?
      easy
      A. ReLU
      B. Sigmoid
      C. Softmax
      D. Linear

      Solution

      1. Step 1: Understand the role of activation functions in hidden layers

        Hidden layers need non-linear activation functions so the network can learn complex patterns. The question asks for one that lets positive values pass and blocks negative ones.
      2. Step 2: Identify which function keeps positive signals

        ReLU (Rectified Linear Unit) outputs zero for negative inputs and passes positive inputs unchanged, making it ideal for hidden layers.
      3. Final Answer:

        ReLU -> Option A
      4. Quick Check:

        Hidden layers use ReLU = A [OK]
      Hint: ReLU blocks negatives, perfect for hidden layers [OK]
      Common Mistakes:
      • Confusing sigmoid as best for hidden layers
      • Thinking softmax works for hidden layers
      • Assuming linear activation adds non-linearity
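The last mistake in the list can be checked with a small sketch: two stacked layers with a linear activation collapse into a single linear layer, while a ReLU between them does not. The scalar weights and function names below are hypothetical values chosen for illustration.

```python
def relu(x):
    """ReLU: pass positive inputs, zero out the rest."""
    return max(0.0, x)

# Hypothetical scalar "layers" of the form y = w * x + b.
w1, b1 = 2.0, 1.0
w2, b2 = -3.0, 0.5

def two_linear_layers(x):
    # Layer 2 applied to layer 1, with no non-linearity in between.
    return w2 * (w1 * x + b1) + b2

def collapsed_single_layer(x):
    # The same mapping written as one linear layer.
    return (w2 * w1) * x + (w2 * b1 + b2)

def with_relu(x):
    # A ReLU between the layers breaks the collapse.
    return w2 * relu(w1 * x + b1) + b2

inputs = [-2.0, -1.0, 0.0, 1.0, 2.0]
for x in inputs:
    print(x, two_linear_layers(x), collapsed_single_layer(x), with_relu(x))
```

The first two columns agree for every input, so the extra linear layer adds no expressive power. The ReLU version differs wherever the first layer's output is negative.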
      2. Which of the following is the correct way to apply the sigmoid activation function in TensorFlow?
      easy
      A. tf.nn.relu(x)
      B. tf.nn.sigmoid(x)
      C. tf.sigmoid(x)
      D. tf.activation.sigmoid(x)

      Solution

      1. Step 1: Recall TensorFlow activation function syntax

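        TensorFlow exposes its activation functions in the tf.nn module, so sigmoid is available as tf.nn.sigmoid.
      2. Step 2: Check each option for correct syntax

        tf.nn.sigmoid(x) is the intended answer. tf.nn.relu(x) applies ReLU rather than sigmoid, and tf.activation is not a TensorFlow module. Note that tf.sigmoid(x) is an alias of the same function in TensorFlow 2 and would also run, although this quiz treats the tf.nn form as canonical.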
      3. Final Answer:

        tf.nn.sigmoid(x) -> Option B
      4. Quick Check:

        Sigmoid in TensorFlow = tf.nn.sigmoid(x) [OK]
      Hint: TensorFlow activations are in tf.nn module [OK]
      Common Mistakes:
      • Assuming tf.sigmoid is a different function (it is an alias of tf.nn.sigmoid)
      • Confusing ReLU with sigmoid function
      • Trying to call activation from tf.activation
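A short sketch of the sigmoid call in practice, assuming TensorFlow 2 with eager execution. To the best of my knowledge, `tf.math.sigmoid` and `tf.sigmoid` are aliases of `tf.nn.sigmoid` in TensorFlow 2, so all three calls below should return the same values; the input tensor is an arbitrary example.

```python
import tensorflow as tf

# Arbitrary example input.
x = tf.constant([-2.0, 0.0, 2.0])

# The call named in the answer above.
via_nn = tf.nn.sigmoid(x)

# Aliases that expose the same operation in TensorFlow 2.
via_math = tf.math.sigmoid(x)
via_top_level = tf.sigmoid(x)

print(via_nn.numpy())
print(via_math.numpy())
print(via_top_level.numpy())
```

The middle element is sigmoid(0) = 0.5 in each case.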
      3. What will be the output of the following code snippet?
      import tensorflow as tf
      x = tf.constant([-1.0, 0.0, 1.0, 2.0])
      output = tf.nn.relu(x)
      print(output.numpy())
      medium
      A. [0.5 0.5 0.5 0.5]
      B. [-1. 0. 1. 2.]
      C. [1. 1. 1. 1.]
      D. [0. 0. 1. 2.]

      Solution

      1. Step 1: Understand ReLU behavior on input tensor

        ReLU outputs zero for negative inputs and passes positive inputs unchanged.
      2. Step 2: Apply ReLU to each element in x

        -1.0 becomes 0.0, 0.0 stays 0.0, 1.0 stays 1.0, 2.0 stays 2.0.
      3. Final Answer:

        [0. 0. 1. 2.] -> Option D
      4. Quick Check:

        ReLU([-1,0,1,2]) = [0,0,1,2] [OK]
      Hint: ReLU clips negatives to zero, keeps positives [OK]
      Common Mistakes:
      • Expecting negative values to remain
      • Confusing ReLU with sigmoid output
      • Assuming output is all ones
      4. Identify the error in the following TensorFlow code that applies softmax activation:
      import tensorflow as tf
      x = tf.constant([2.0, 1.0, 0.1])
      output = tf.nn.softmax(x, axis=1)
      print(output.numpy())
      medium
      A. The axis parameter should be 0 or -1 for this tensor
      B. Softmax cannot be applied to 1D tensors
      C. The axis parameter should be omitted
      D. The axis parameter should be 0 instead of 1

      Solution

      1. Step 1: Check the shape of input tensor x

        x is a 1D tensor with shape (3,), so valid axis values are 0 or -1.
      2. Step 2: Understand axis parameter in softmax

        Axis=1 is invalid for 1D tensor because axis 1 does not exist; axis must be 0 or -1.
      3. Final Answer:

        The axis parameter should be 0 or -1 for this tensor -> Option A
      4. Quick Check:

        Softmax axis for 1D tensor = 0 or -1 [OK]
      Hint: Axis must exist in tensor shape for softmax [OK]
      Common Mistakes:
      • Using axis=1 on a 1D tensor, which raises an error
      • Thinking softmax cannot be applied to 1D tensors
      • Assuming the axis argument is always required (the default, axis=-1, works for a 1D tensor)
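For contrast with the failing snippet, here is a sketch of valid `axis` choices, assuming TensorFlow 2 with eager execution. The 2D tensor is a hypothetical batch added for illustration; it shows where `axis=1` does make sense.

```python
import tensorflow as tf

# 1D tensor of shape (3,): the only axis is 0, also addressable as -1.
scores_1d = tf.constant([2.0, 1.0, 0.1])
probs_1d = tf.nn.softmax(scores_1d, axis=-1)

# 2D tensor of shape (2, 3): here axis=1 exists and normalizes each row.
scores_2d = tf.constant([[2.0, 1.0, 0.1],
                         [0.0, 0.0, 0.0]])
probs_2d = tf.nn.softmax(scores_2d, axis=1)

print(probs_1d.numpy())
# Each row of the batch sums to 1 after softmax along axis=1.
print(tf.reduce_sum(probs_2d, axis=1).numpy())
```

Using `axis=-1` is a common choice because it refers to the last dimension regardless of the tensor's rank.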
      5. You want to build a neural network for multi-class classification with 4 classes. Which activation function should you use in the output layer to get probabilities for each class?
      hard
      A. ReLU
      B. Sigmoid
      C. Softmax
      D. Tanh

      Solution

      1. Step 1: Understand output layer needs for multi-class classification

        Output layer must produce probabilities that sum to 1 across all classes.
      2. Step 2: Identify activation function that outputs class probabilities

        Softmax converts raw scores into probabilities summing to 1, perfect for multi-class outputs.
      3. Final Answer:

        Softmax -> Option C
      4. Quick Check:

        Multi-class output uses Softmax = C [OK]
      Hint: Softmax outputs probabilities summing to 1 [OK]
      Common Mistakes:
      • Using sigmoid for multi-class instead of softmax
      • Choosing ReLU which doesn't output probabilities
      • Confusing tanh with probability output
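The first mistake in the list can be made concrete with plain Python. The four logits below are hypothetical raw scores. In Keras the output layer would typically be written as `tf.keras.layers.Dense(4, activation="softmax")`, but the sketch avoids TensorFlow so the arithmetic stays visible.

```python
import math

# Hypothetical raw scores (logits) for 4 classes.
logits = [1.5, 0.3, -0.8, 2.1]

# Softmax: the outputs compete with each other and sum to 1.
shift = max(logits)
exps = [math.exp(z - shift) for z in logits]
softmax_probs = [e / sum(exps) for e in exps]

# Element-wise sigmoid: each output lies in (0, 1) but is computed independently.
sigmoid_outputs = [1.0 / (1.0 + math.exp(-z)) for z in logits]

print("softmax sum:", round(sum(softmax_probs), 6))
print("sigmoid sum:", round(sum(sigmoid_outputs), 6))

# The predicted class is the index with the largest softmax probability.
predicted_class = softmax_probs.index(max(softmax_probs))
print("predicted class index:", predicted_class)
```

The sigmoid outputs add up to more than 1 here, so they cannot be read as a probability distribution over mutually exclusive classes, whereas the softmax outputs can.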