
Activation functions (ReLU, sigmoid, softmax) in TensorFlow - Model Pipeline Trace

Model Pipeline - Activation functions (ReLU, sigmoid, softmax)

This pipeline shows how data moves through a simple neural network using three common activation functions: ReLU, sigmoid, and softmax. These functions help the model learn by adding non-linear behavior and producing probabilities for classification.

Data Flow - 4 Stages
1. Input Layer
Shape: 1 row x 4 columns -> 1 row x 4 columns
Raw input features representing 4 numeric values
[2.0, -1.0, 0.5, 3.0]
2. Hidden Layer with ReLU
Shape: 1 row x 4 columns -> 1 row x 3 columns
Linear transformation (weights and bias) followed by ReLU activation (max(0, x))
Input [2.0, -1.0, 0.5, 3.0] -> Linear output [1.5, -0.5, 2.0] -> ReLU output [1.5, 0.0, 2.0]
3. Hidden Layer with Sigmoid
Shape: 1 row x 3 columns -> 1 row x 3 columns
Linear transformation followed by sigmoid activation (output between 0 and 1)
Input [1.5, 0.0, 2.0] -> Linear output [0.8, -1.2, 0.5] -> Sigmoid output [0.69, 0.23, 0.62]
4. Output Layer with Softmax
Shape: 1 row x 3 columns -> 1 row x 3 columns
Linear transformation followed by softmax activation (outputs sum to 1, representing class probabilities)
Input [0.69, 0.23, 0.62] -> Linear output [2.0, 1.0, 0.1] -> Softmax output [0.66, 0.24, 0.10]
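The activation step in each stage above can be checked by hand. The following is a minimal sketch in plain Python (no TensorFlow) that applies each activation to the linear outputs listed in the stages. The helper names relu, sigmoid, and softmax are illustrative, and the linear outputs are taken as given because the weights and biases are not shown.

```python
import math

def relu(values):
    """Element-wise max(0, x)."""
    return [max(0.0, v) for v in values]

def sigmoid(values):
    """Element-wise 1 / (1 + exp(-x)); every output lies between 0 and 1."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]

def softmax(values):
    """Exponentiate and normalize so the outputs sum to 1."""
    shift = max(values)  # subtracting the max avoids overflow; the result is unchanged
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

relu_out = relu([1.5, -0.5, 2.0])
sigmoid_out = sigmoid([0.8, -1.2, 0.5])
softmax_out = softmax([2.0, 1.0, 0.1])

print([round(v, 2) for v in relu_out])     # [1.5, 0.0, 2.0]
print([round(v, 2) for v in sigmoid_out])  # [0.69, 0.23, 0.62]
print([round(v, 2) for v in softmax_out])  # [0.66, 0.24, 0.1]
```

The rounded results agree with the values shown in stages 2 to 4.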
Training Trace - Epoch by Epoch

Loss
1.2 |*       
1.0 | *      
0.8 |  *     
0.6 |   *    
0.4 |    *   
    +---------
     1 2 3 4 5 Epochs
Epoch | Loss ↓ | Accuracy ↑ | Observation
1 | 1.2 | 0.45 | Loss starts high, accuracy low as model begins learning
2 | 0.9 | 0.60 | Loss decreases, accuracy improves as activations help model learn
3 | 0.7 | 0.72 | Model continues to improve with clearer decision boundaries
4 | 0.5 | 0.80 | Loss decreases steadily, accuracy rises showing good learning
5 | 0.4 | 0.85 | Model converges with lower loss and higher accuracy
Prediction Trace - 4 Layers
Layer 1: Input Layer
Layer 2: Hidden Layer with ReLU
Layer 3: Hidden Layer with Sigmoid
Layer 4: Output Layer with Softmax
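As a rough illustration, the four layers could be expressed with the Keras API as follows. This is a sketch under assumptions: the layer sizes (4 -> 3 -> 3 -> 3) are read off the data-flow shapes, and the weights are randomly initialized, so the printed probabilities will not match the numbers in the trace.

```python
import tensorflow as tf

# Layer sizes follow the trace: 4 inputs -> 3 (ReLU) -> 3 (sigmoid) -> 3 (softmax).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="relu"),
    tf.keras.layers.Dense(3, activation="sigmoid"),
    tf.keras.layers.Dense(3, activation="softmax"),
])

x = tf.constant([[2.0, -1.0, 0.5, 3.0]])  # 1 row x 4 columns
probs = model(x)                           # 1 row x 3 columns

print(probs.shape)
print(float(tf.reduce_sum(probs)))  # close to 1.0 because of the softmax output
```

Whatever the weights are, the softmax output row sums to approximately 1.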
Model Quiz - 3 Questions
Test your understanding
What does the ReLU activation function do to negative input values?
A. Converts them to probabilities
B. Sets them to zero
C. Leaves them unchanged
D. Maps them between 0 and 1
Key Insight
Activation functions like ReLU, sigmoid, and softmax add important non-linear transformations that help neural networks learn complex patterns and produce meaningful outputs such as probabilities for classification.
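One way to see why the non-linearity matters: two linear layers with no activation between them collapse into a single linear layer, while inserting ReLU breaks that equivalence. The sketch below uses made-up scalar weights purely for illustration.

```python
# Hypothetical scalar weights and biases, chosen only for illustration.
w1, b1 = 2.0, -1.0
w2, b2 = -3.0, 0.5

def two_linear_layers(x):
    """Two stacked linear layers with no activation in between."""
    return w2 * (w1 * x + b1) + b2

def single_linear_layer(x):
    """The same function rewritten with one combined weight and bias."""
    return (w2 * w1) * x + (w2 * b1 + b2)

def with_relu(x):
    """ReLU between the two layers; no single linear layer reproduces this."""
    hidden = max(0.0, w1 * x + b1)
    return w2 * hidden + b2

for x in [-1.0, 0.0, 1.0]:
    print(x, two_linear_layers(x), single_linear_layer(x), with_relu(x))
```

The first two functions agree for every input. The ReLU version differs wherever the hidden value would have been negative.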

Practice

1. Which activation function is best suited for hidden layers in a neural network to keep only positive signals?
easy
A. ReLU
B. Sigmoid
C. Softmax
D. Linear

Solution

  1. Step 1: Understand the role of activation functions in hidden layers

    Hidden layers need a non-linear activation so the network can learn complex patterns. To keep only positive signals, the activation must pass positive values through and block negative ones.
  2. Step 2: Identify which function keeps positive signals

    ReLU (Rectified Linear Unit) outputs zero for negative inputs and passes positive inputs unchanged, making it ideal for hidden layers.
  3. Final Answer:

    ReLU -> Option A
  4. Quick Check:

    Hidden layers use ReLU = A [OK]
Hint: ReLU blocks negatives, perfect for hidden layers [OK]
Common Mistakes:
  • Confusing sigmoid as best for hidden layers
  • Thinking softmax works for hidden layers
  • Assuming linear activation adds non-linearity
2. Which of the following applies the sigmoid activation function using TensorFlow's tf.nn module?
easy
A. tf.nn.relu(x)
B. tf.nn.sigmoid(x)
C. tf.sigmoid(x)
D. tf.activation.sigmoid(x)

Solution

  1. Step 1: Recall TensorFlow activation function syntax

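    TensorFlow exposes its activation functions in the tf.nn module, so sigmoid is available as tf.nn.sigmoid.
  2. Step 2: Check each option for correct syntax

    tf.nn.sigmoid(x) is a valid call that applies sigmoid element-wise. tf.nn.relu(x) is valid but applies ReLU, and tf.activation does not exist. tf.sigmoid(x) is an alias of the same function, so it also runs, but it is not the tf.nn spelling.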
  3. Final Answer:

    tf.nn.sigmoid(x) -> Option B
  4. Quick Check:

    Sigmoid in TensorFlow = tf.nn.sigmoid(x) [OK]
Hint: TensorFlow activations are in tf.nn module [OK]
Common Mistakes:
  • Assuming tf.sigmoid and tf.nn.sigmoid compute different things (they are aliases of the same function)
  • Confusing ReLU with sigmoid function
  • Trying to call activation from tf.activation
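A short sketch of the sigmoid call, compared against the definition 1 / (1 + exp(-x)). The input values are borrowed from the sigmoid stage of the pipeline trace. It also calls tf.sigmoid, which is an alias of the same function, to show that both spellings return the same result.

```python
import math
import tensorflow as tf

inputs = [0.8, -1.2, 0.5]
x = tf.constant(inputs)

via_nn = tf.nn.sigmoid(x)  # the tf.nn spelling
via_alias = tf.sigmoid(x)  # alias of the same function

# Reference values computed directly from the definition.
expected = [1.0 / (1.0 + math.exp(-v)) for v in inputs]

print(via_nn.numpy())
print(via_alias.numpy())
print(expected)
```

TensorFlow computes in float32 here, so the values match the reference only up to small rounding differences.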
3. What will be the output of the following code snippet?
import tensorflow as tf
x = tf.constant([-1.0, 0.0, 1.0, 2.0])
output = tf.nn.relu(x)
print(output.numpy())
medium
A. [0.5 0.5 0.5 0.5]
B. [-1. 0. 1. 2.]
C. [1. 1. 1. 1.]
D. [0. 0. 1. 2.]

Solution

  1. Step 1: Understand ReLU behavior on input tensor

    ReLU outputs zero for negative inputs and passes positive inputs unchanged.
  2. Step 2: Apply ReLU to each element in x

    -1.0 becomes 0.0, 0.0 stays 0.0, 1.0 stays 1.0, 2.0 stays 2.0.
  3. Final Answer:

    [0. 0. 1. 2.] -> Option D
  4. Quick Check:

    ReLU([-1,0,1,2]) = [0,0,1,2] [OK]
Hint: ReLU clips negatives to zero, keeps positives [OK]
Common Mistakes:
  • Expecting negative values to remain
  • Confusing ReLU with sigmoid output
  • Assuming output is all ones
4. Identify the error in the following TensorFlow code that applies softmax activation:
import tensorflow as tf
x = tf.constant([2.0, 1.0, 0.1])
output = tf.nn.softmax(x, axis=1)
print(output.numpy())
medium
A. The axis parameter should be 0 or -1 for this tensor
B. Softmax cannot be applied to 1D tensors
C. The axis parameter should be omitted
D. The axis parameter should be 0 instead of 1

Solution

  1. Step 1: Check the shape of input tensor x

    x is a 1D tensor with shape (3,), so valid axis values are 0 or -1.
  2. Step 2: Understand axis parameter in softmax

    Axis=1 is invalid for 1D tensor because axis 1 does not exist; axis must be 0 or -1.
  3. Final Answer:

    The axis parameter should be 0 or -1 for this tensor -> Option A
  4. Quick Check:

    Softmax axis for 1D tensor = 0 or -1 [OK]
Hint: Axis must exist in tensor shape for softmax [OK]
Common Mistakes:
  • Using axis=1 on 1D tensor causes error
  • Thinking softmax can't apply to 1D tensors
  • Forgetting that omitting axis defaults to -1 (the last dimension), which is valid here
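The axis behaviour can be sketched as follows. The example catches the error from axis=1 rather than letting it stop the script. The exact exception class is an assumption (tf.errors.InvalidArgumentError in the TensorFlow 2.x releases I am aware of), so ValueError is caught as well. The 2D batch tensor is a hypothetical input added only to show where axis=1 is valid.

```python
import tensorflow as tf

x = tf.constant([2.0, 1.0, 0.1])  # shape (3,): the only axis is 0 (also reachable as -1)

# axis=-1 (also the default) normalizes over the last, and here only, axis.
probs = tf.nn.softmax(x, axis=-1)
print(probs.numpy())

# axis=1 does not exist on a 1D tensor, so the call is rejected.
# The exception type may vary by TensorFlow version (assumption).
try:
    tf.nn.softmax(x, axis=1)
    axis_one_failed = False
except (tf.errors.InvalidArgumentError, ValueError) as err:
    axis_one_failed = True
    print(type(err).__name__)

# Hypothetical 2D batch: here axis=1 is valid and normalizes each row.
batch = tf.constant([[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]])
row_probs = tf.nn.softmax(batch, axis=1)
print(row_probs.numpy())
```

In the batch case each row sums to 1 independently, which is the usual layout for a classifier's output.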
5. You want to build a neural network for multi-class classification with 4 classes. Which activation function should you use in the output layer to get probabilities for each class?
hard
A. ReLU
B. Sigmoid
C. Softmax
D. Tanh

Solution

  1. Step 1: Understand output layer needs for multi-class classification

    Output layer must produce probabilities that sum to 1 across all classes.
  2. Step 2: Identify activation function that outputs class probabilities

    Softmax converts raw scores into probabilities summing to 1, perfect for multi-class outputs.
  3. Final Answer:

    Softmax -> Option C
  4. Quick Check:

    Multi-class output uses Softmax = C [OK]
Hint: Softmax outputs probabilities summing to 1 [OK]
Common Mistakes:
  • Using sigmoid for multi-class instead of softmax
  • Choosing ReLU which doesn't output probabilities
  • Confusing tanh with probability output
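To illustrate the difference between softmax and sigmoid on a 4-class output, the sketch below applies both to the same raw scores. The logits are hypothetical values chosen for the example.

```python
import tensorflow as tf

# Hypothetical raw scores (logits) for 4 classes, one row per example.
logits = tf.constant([[2.0, 1.0, 0.1, -1.0]])

softmax_probs = tf.nn.softmax(logits, axis=-1)  # one distribution across the 4 classes
sigmoid_scores = tf.nn.sigmoid(logits)          # 4 independent values, each in (0, 1)

softmax_total = float(tf.reduce_sum(softmax_probs))
sigmoid_total = float(tf.reduce_sum(sigmoid_scores))
predicted_class = int(tf.argmax(softmax_probs, axis=-1)[0])

print(softmax_total)    # close to 1.0
print(sigmoid_total)    # not constrained to sum to 1
print(predicted_class)  # index of the most likely class
```

Because the sigmoid values are computed independently, they do not form a probability distribution over the classes, which is why softmax is the usual choice for a single-label multi-class output.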