What is the main reason to use activation functions in neural networks?
Think about what would happen if the network was just a chain of linear operations.
Activation functions add non-linearity, allowing the network to learn complex patterns beyond simple linear relationships.
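A quick sketch of why this matters (my own illustration, not from the card): without activations, stacking linear layers collapses into a single linear map, so depth adds no expressive power.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))  # first "layer" weights
W2 = rng.standard_normal((2, 4))  # second "layer" weights
x = rng.standard_normal(3)

# Two linear layers with no activation in between...
y_two_layers = W2 @ (W1 @ x)

# ...are exactly equivalent to one combined linear layer.
W_combined = W2 @ W1
y_one_layer = W_combined @ x

print(np.allclose(y_two_layers, y_one_layer))  # True: depth collapsed
```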
What is the output of the ReLU function for the input array [-2, 0, 3]?
import numpy as np

inputs = np.array([-2, 0, 3])
outputs = np.maximum(0, inputs)
print(outputs.tolist())
ReLU outputs zero for negative inputs and the input itself if positive.
The output is [0, 0, 3]: ReLU returns 0 for negative inputs and passes zero and positive inputs through unchanged.
Which activation function is generally preferred for hidden layers in deep neural networks to avoid vanishing gradients?
Consider which function keeps gradients strong during backpropagation.
ReLU helps avoid vanishing gradients by outputting zero or positive values, keeping gradients alive during training.
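To make the gradient argument concrete, here is a small sketch (my own illustration, with an assumed pre-activation value of 0.5 and a depth of 20): sigmoid's derivative is at most 0.25, so multiplying it across many layers shrinks the gradient toward zero, while ReLU's derivative is exactly 1 for positive inputs.

```python
import numpy as np

def sigmoid_grad(x):
    # Derivative of sigmoid: s * (1 - s), never larger than 0.25
    s = 1 / (1 + np.exp(-x))
    return s * (1 - s)

def relu_grad(x):
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise
    return float(x > 0)

x = 0.5     # assumed typical pre-activation value
depth = 20  # assumed number of stacked layers

# The backpropagated gradient is (roughly) a product of local derivatives.
sigmoid_chain = sigmoid_grad(x) ** depth
relu_chain = relu_grad(x) ** depth

print(f"sigmoid gradient after {depth} layers: {sigmoid_chain:.2e}")
print(f"relu gradient after {depth} layers:    {relu_chain:.2e}")
```

The sigmoid chain shrinks by many orders of magnitude while the ReLU chain stays at 1, which is the vanishing-gradient contrast the answer describes.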
A model trained with sigmoid activation in hidden layers achieves 70% accuracy. After switching to ReLU, accuracy improves to 85%. What is the most likely reason?
Think about how activation functions affect training dynamics and gradients.
ReLU improves gradient flow and speeds up training, often leading to better accuracy compared to sigmoid in hidden layers.
What does this code output when applying the sigmoid function incorrectly?
import numpy as np
def sigmoid(x):
return 1 / (1 + np.exp(x))
print(sigmoid(1))
Check the formula for sigmoid and the sign inside the exponent.
The sigmoid function is defined incorrectly: the exponent should be negated, i.e. 1 / (1 + np.exp(-x)). For x=1, the correct output is ~0.731, but this code outputs ~0.269.
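For reference, a corrected version of the function from the question, verifying the value the answer cites:

```python
import numpy as np

def sigmoid(x):
    # Correct sigmoid: note the NEGATIVE sign inside the exponent
    return 1 / (1 + np.exp(-x))

print(round(sigmoid(1), 3))  # 0.731
```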