Experiment - Activation functions (ReLU, sigmoid, softmax)
Problem: You have a neural network model for classifying handwritten digits (0-9) using the MNIST dataset. The model currently uses sigmoid activation in all layers.
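The setup described above might look like the following Keras sketch. Only the activations are given in the problem statement, so the layer widths (128 and 64 units), optimizer, and loss below are assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical reconstruction of the current model: sigmoid in every layer,
# including the 10-way output. Layer widths are assumed, not from the source.
model = keras.Sequential([
    keras.Input(shape=(28, 28)),               # MNIST images are 28x28 grayscale
    layers.Flatten(),
    layers.Dense(128, activation="sigmoid"),
    layers.Dense(64, activation="sigmoid"),
    layers.Dense(10, activation="sigmoid"),    # per-class sigmoids: outputs do not sum to 1
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```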
Current Metrics: Training accuracy: 92%, validation accuracy: 85%, training loss: 0.25, validation loss: 0.40
Issue: The model trains, but validation accuracy (85%) trails training accuracy (92%), indicating some overfitting. Training is also slow: sigmoid saturates for inputs far from zero, so gradients shrink as they propagate back through deeper layers (the vanishing-gradient problem).
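A minimal sketch of the revised model, keeping the same assumed layer widths: ReLU in the hidden layers avoids saturation (its derivative is 1 for positive inputs, so gradients pass through undiminished), and softmax on the output turns the 10 class scores into a probability distribution suited to cross-entropy loss:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Revised sketch: ReLU hidden layers, softmax output. Widths are still assumptions.
revised = keras.Sequential([
    keras.Input(shape=(28, 28)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),    # relu(x) = max(0, x): no saturation for x > 0
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),  # probabilities over digits 0-9
])
revised.compile(optimizer="adam",
                loss="sparse_categorical_crossentropy",
                metrics=["accuracy"])
```

Note that the activation change mainly addresses the slow learning and vanishing gradients; the train/validation gap itself is a regularization problem, so some gap may remain even after the swap.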