Recall & Review
beginner
What does the ReLU activation function do to input values?
ReLU (Rectified Linear Unit) outputs the input directly if it is positive; otherwise, it outputs zero. Because its gradient is 1 for positive inputs, it lets useful gradient signal flow and helps models train faster.
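The ReLU definition above is a one-liner; a minimal sketch (the function name `relu` is my own choice, not from the card):

```python
def relu(x):
    # Pass positive inputs through unchanged; clamp negatives to zero.
    return max(0.0, x)

print([relu(v) for v in [-2.0, -0.5, 0.0, 1.5]])  # → [0.0, 0.0, 0.0, 1.5]
```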
beginner
Describe the sigmoid activation function and its output range.
The sigmoid function squashes input values into a range between 0 and 1, making it useful for probabilities in binary classification tasks.
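The squashing behavior can be checked directly from the formula σ(x) = 1 / (1 + e⁻ˣ); a small sketch (function name `sigmoid` assumed for illustration):

```python
import math

def sigmoid(x):
    # Maps any real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0))    # → 0.5 (the midpoint of the range)
print(sigmoid(10.0))   # close to 1
print(sigmoid(-10.0))  # close to 0
```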
intermediate
What is the purpose of the softmax activation function in neural networks?
Softmax converts a vector of raw scores into probabilities that sum to 1, often used in the output layer for multi-class classification.
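A minimal softmax sketch (the max-subtraction step is a standard numerical-stability trick, not something the card mentions; it does not change the result):

```python
import math

def softmax(scores):
    # Subtract the max score before exponentiating to avoid overflow;
    # this shift cancels out in the normalization.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)       # three positive values
print(sum(probs))  # → 1.0 (up to floating-point rounding)
```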
intermediate
Why is ReLU preferred over sigmoid in hidden layers of deep networks?
ReLU avoids the vanishing gradient problem by not saturating for positive inputs, allowing faster and more effective training compared to sigmoid.
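The saturation claim is easy to see numerically: sigmoid's derivative is σ(x)(1 − σ(x)), which peaks at 0.25 and collapses toward zero for large |x|, while ReLU's derivative stays at 1 for any positive input. A quick comparison (helper names are mine):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # Derivative of sigmoid: sigma(x) * (1 - sigma(x)); never exceeds 0.25.
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise.
    return 1.0 if x > 0 else 0.0

# At a large positive input, sigmoid's gradient has nearly vanished
# while ReLU's gradient is still 1.
print(sigmoid_grad(10.0))  # tiny value near 0
print(relu_grad(10.0))     # → 1.0
```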
intermediate
How does softmax ensure the outputs can be interpreted as probabilities?
Softmax exponentiates each input and divides by the sum of all exponentiated inputs, ensuring all outputs are positive and sum to 1.
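The two properties in that answer, positivity (e^x > 0 for every real x) and normalization (dividing by the sum), can be verified directly; a small check, assuming a `softmax` helper like the one sketched earlier:

```python
import math

def softmax(scores):
    # Exponentiate each score, then divide by the sum of all exponentials.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

p = softmax([3.0, -1.0, 0.5])
# Every output is strictly positive and the outputs sum to 1,
# so they form a valid probability distribution.
print(all(q > 0 for q in p), sum(p))
```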
Which activation function outputs zero for all negative inputs?
ReLU outputs zero for negative inputs and passes positive inputs unchanged.
What is the output range of the sigmoid activation function?
Sigmoid squashes inputs into the range 0 to 1.
Which activation function is best suited for multi-class classification output layers?
Softmax outputs probabilities for each class that sum to 1, ideal for multi-class classification.
Why might sigmoid activation slow down training in deep networks?
Sigmoid saturates at extremes, causing gradients to become very small and slowing learning.
How does softmax transform its input vector?
Softmax exponentiates each input and divides by the sum of all exponentials to produce probabilities.
Explain the differences between ReLU, sigmoid, and softmax activation functions and when to use each.
Think about output ranges and typical use cases in neural networks.
Describe why ReLU helps avoid the vanishing gradient problem compared to sigmoid.
Consider how gradients behave during backpropagation.