Activation functions help a neural network learn complex patterns by deciding which signals to pass forward. They add non-linearity; without it, a stack of layers would collapse into a single linear transformation, no matter how many layers it has.
Activation functions (ReLU, sigmoid, softmax) in TensorFlow
tf.keras.layers.Activation('relu')
tf.keras.layers.Activation('sigmoid')
tf.keras.layers.Activation('softmax')
ReLU outputs zero for negative inputs and the input itself if positive.
Sigmoid squashes input values between 0 and 1, useful for binary outputs.
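The two definitions above can be written out in a few lines of plain Python. This is only an illustrative sketch of the math, not how TensorFlow implements these functions; the helper names `relu` and `sigmoid` are chosen here for illustration.

```python
import math

def relu(x: float) -> float:
    """Return x when it is positive, otherwise 0."""
    return max(0.0, x)

def sigmoid(x: float) -> float:
    """Map a real number into the interval (0, 1).

    Naive form for illustration: math.exp(-x) overflows for very
    large negative x, which production code guards against.
    """
    return 1.0 / (1.0 + math.exp(-x))

# Print input, ReLU output, and sigmoid output side by side.
for value in [-1.0, 0.0, 1.0, 2.0]:
    print(value, relu(value), round(sigmoid(value), 4))
```

Note that sigmoid(0) is exactly 0.5, the midpoint of its output range, while ReLU leaves every positive input untouched.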
layer = tf.keras.layers.Dense(10, activation='relu')
layer = tf.keras.layers.Dense(1, activation='sigmoid')
layer = tf.keras.layers.Dense(3, activation='softmax')
This code shows how ReLU, sigmoid, and softmax activation functions transform the same input array. It prints the input and the outputs after applying each activation.
import tensorflow as tf
import numpy as np

# Sample input data
x = np.array([[-1.0, 0.0, 1.0, 2.0]])

# Define layers with different activations
relu_layer = tf.keras.layers.Activation('relu')
sigmoid_layer = tf.keras.layers.Activation('sigmoid')
softmax_layer = tf.keras.layers.Activation('softmax')

# Apply activations
relu_output = relu_layer(x)
sigmoid_output = sigmoid_layer(x)
softmax_output = softmax_layer(x)

print('Input:', x)
print('ReLU output:', relu_output.numpy())
print('Sigmoid output:', sigmoid_output.numpy())
print('Softmax output:', softmax_output.numpy())
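To make the softmax step less of a black box, here is a minimal plain-Python sketch of the underlying formula: exponentiate each score, then divide by the sum of the exponentials. The `softmax` helper is hypothetical and written for this example. Subtracting the largest score first is a common numerical-stability trick that does not change the result.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Shifting by the largest score keeps exp() from overflowing
    # and leaves the final ratios unchanged.
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([-1.0, 0.0, 1.0, 2.0])
print([round(p, 4) for p in probs])  # larger scores get larger probabilities
print(sum(probs))                    # approximately 1.0
```

Unlike ReLU and sigmoid, which treat each element independently, softmax depends on all of the scores at once, because every output is divided by the same total.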
ReLU is simple and fast but can cause some neurons to 'die' if they always output zero.
Sigmoid outputs can saturate near 0 or 1, where the gradient becomes very small; this slows learning in deep networks.
Softmax is typically used in the output layer for multi-class classification to get probabilities.
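The first two notes are easier to see through the gradients. The sketch below, in plain Python with hypothetical helper names, evaluates the derivative of sigmoid and of ReLU at a few points. It illustrates the math only and is not TensorFlow's own gradient code.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of sigmoid: s * (1 - s), where s = sigmoid(x)."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise.

    The value at exactly x == 0 is a convention; 0 is assumed here.
    """
    return 1.0 if x > 0 else 0.0

for x in [-10.0, 0.0, 10.0]:
    print(x, sigmoid_grad(x), relu_grad(x))
```

The sigmoid gradient peaks at 0.25 for x = 0 and shrinks toward zero for large positive or negative inputs, which is the saturation effect. The ReLU gradient is exactly zero for negative inputs, so a neuron whose input stays negative receives no update and can "die".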
Activation functions add non-linearity to neural networks.
ReLU is good for hidden layers to keep positive signals.
Sigmoid and softmax are used for output layers to get probabilities.
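As a rough sketch of how these choices combine, the model below uses ReLU in a hidden layer and softmax in the output layer. The layer sizes (4 input features, 8 hidden units, 3 classes) and the all-zero batch are made up for illustration and do not come from the lesson.

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes: 4 input features, 8 hidden units, 3 classes.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation='relu'),     # hidden layer
    tf.keras.layers.Dense(3, activation='softmax'),  # multi-class output
])

# Two made-up samples, used only to inspect the output.
batch = np.zeros((2, 4), dtype=np.float32)
probs = model(batch).numpy()

print(probs.shape)        # one row of class probabilities per sample
print(probs.sum(axis=1))  # each row sums to approximately 1
```

For a binary classifier, the last layer would instead be a single unit with a sigmoid activation, as in the Dense(1, activation='sigmoid') line shown earlier.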
Practice
Solution
Step 1: Understand the role of activation functions in hidden layers
Hidden layers need non-linear functions that let positive values pass and block negative ones, which helps the network learn complex patterns.
Step 2: Identify which function keeps positive signals
ReLU (Rectified Linear Unit) outputs zero for negative inputs and passes positive inputs unchanged, making it ideal for hidden layers.
Final Answer:
ReLU -> Option A
Quick Check:
Hidden layers use ReLU [OK]
- Confusing sigmoid as best for hidden layers
- Thinking softmax works for hidden layers
- Assuming linear activation adds non-linearity
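The last mistake in the list can be checked with a tiny numeric sketch. It uses two one-unit layers with made-up weights and no biases, in plain Python: without an activation, the two layers behave exactly like one linear layer; with ReLU in between, they do not.

```python
def relu(x):
    return max(0.0, x)

# Hypothetical weights for two one-unit layers without biases.
w1, w2 = -2.0, 3.0

def two_linear_layers(x):
    """Layer 2 applied to layer 1, with no activation in between."""
    return w2 * (w1 * x)

def single_linear_layer(x):
    """One layer whose weight is the product of the two weights."""
    return (w2 * w1) * x

def two_layers_with_relu(x):
    """Same two layers, but with ReLU after the first one."""
    return w2 * relu(w1 * x)

inputs = [-1.0, 0.5, 2.0]
print([two_linear_layers(x) for x in inputs])     # same values as the next line
print([single_linear_layer(x) for x in inputs])
print([two_layers_with_relu(x) for x in inputs])  # differs: negatives are cut off
```

The ReLU version no longer satisfies f(a) + f(b) = f(a + b), which is exactly the non-linearity a hidden layer needs.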
Solution
Step 1: Recall TensorFlow activation function syntax
TensorFlow exposes activation functions in the tf.nn module, so sigmoid is available as tf.nn.sigmoid.
Step 2: Check each option for correct syntax
tf.nn.sigmoid(x) is a valid call that applies sigmoid element-wise to x. The other options are invalid or do not exist.
Final Answer:
tf.nn.sigmoid(x) -> Option B
Quick Check:
Sigmoid in TensorFlow = tf.nn.sigmoid(x) [OK]
- Assuming tf.sigmoid is a different function (in TensorFlow 2.x it is an alias of tf.nn.sigmoid)
- Confusing ReLU with sigmoid function
- Trying to call activation from tf.activation
import tensorflow as tf

x = tf.constant([-1.0, 0.0, 1.0, 2.0])
output = tf.nn.relu(x)
print(output.numpy())
Solution
Step 1: Understand ReLU behavior on input tensor
ReLU outputs zero for negative inputs and passes positive inputs unchanged.
Step 2: Apply ReLU to each element in x
-1.0 becomes 0.0, 0.0 stays 0.0, 1.0 stays 1.0, 2.0 stays 2.0.
Final Answer:
[0. 0. 1. 2.] -> Option D
Quick Check:
ReLU([-1,0,1,2]) = [0,0,1,2] [OK]
- Expecting negative values to remain
- Confusing ReLU with sigmoid output
- Assuming output is all ones
import tensorflow as tf

x = tf.constant([2.0, 1.0, 0.1])
output = tf.nn.softmax(x, axis=1)
print(output.numpy())
Solution
Step 1: Check the shape of input tensor x
x is a 1D tensor with shape (3,), so valid axis values are 0 or -1.
Step 2: Understand axis parameter in softmax
Axis=1 is invalid for 1D tensor because axis 1 does not exist; axis must be 0 or -1.
Final Answer:
The axis parameter should be 0 or -1 for this tensor -> Option A
Quick Check:
Softmax axis for 1D tensor = 0 or -1 [OK]
- Using axis=1 on 1D tensor causes error
- Thinking softmax can't apply to 1D tensors
- Not realizing that omitting axis also works, because tf.nn.softmax defaults to the last axis (axis=-1)
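To show what the axis argument controls, here is a small NumPy sketch. The `softmax` helper is a hypothetical stand-in written for this example, not TensorFlow's implementation; it only mirrors the idea that softmax normalizes along one chosen axis.

```python
import numpy as np

def softmax(x, axis=-1):
    """NumPy stand-in for a softmax that normalizes along `axis`."""
    shifted = x - np.max(x, axis=axis, keepdims=True)  # for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=axis, keepdims=True)

vector = np.array([2.0, 1.0, 0.1])   # shape (3,): only axis 0 (or -1) exists
matrix = np.array([[2.0, 1.0, 0.1],
                   [0.5, 0.5, 0.5]])  # shape (2, 3): axis 0 and axis 1 exist

print(softmax(vector, axis=-1))             # one distribution over 3 entries
print(softmax(matrix, axis=1).sum(axis=1))  # each row sums to about 1
print(softmax(matrix, axis=0).sum(axis=0))  # each column sums to about 1
```

For a batch of logits with shape (batch, classes), the last axis is usually the one to normalize over, which is why a default of -1 is a sensible choice.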
Solution
Step 1: Understand output layer needs for multi-class classification
The output layer must produce probabilities that sum to 1 across all classes.
Step 2: Identify activation function that outputs class probabilities
Softmax converts raw scores into probabilities summing to 1, which suits multi-class outputs.
Final Answer:
Softmax -> Option C
Quick Check:
Multi-class output uses Softmax [OK]
- Using sigmoid for multi-class instead of softmax
- Choosing ReLU which doesn't output probabilities
- Confusing tanh with probability output
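The first mistake in this list can be illustrated numerically. The sketch below, in plain Python with made-up scores and hypothetical helper names, applies sigmoid to each class score independently and compares the result with softmax over the same scores.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    largest = max(scores)  # shift for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]  # made-up raw scores for three classes

per_class_sigmoid = [sigmoid(s) for s in scores]
class_probs = softmax(scores)

print(sum(per_class_sigmoid))  # greater than 1: not a distribution over classes
print(sum(class_probs))        # approximately 1
```

Independent sigmoids answer "is this class present?" for each class separately, which fits multi-label problems. Softmax forces the classes to compete for a total probability of 1, which is what a single-label multi-class output needs.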
