Why does an LSTM cell use gates with sigmoid activations instead of just using tanh activations everywhere?

Hard · Conceptual · Q10 of 15

NLP - Sequence Models for NLP

A. Tanh activations are too slow to compute

B. Sigmoid gates control information flow by outputting values between 0 and 1

C. Sigmoid activations prevent overfitting better than tanh

D. Tanh activations cannot be used in recurrent networks

Step-by-Step Solution

Solution:

Step 1: Understand gate function in LSTM
Gates decide how much information to keep or discard, so each gate must output a fraction between 0 and 1 that is multiplied element-wise with the signal it controls.
Step 2: Recognize sigmoid role
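Sigmoid outputs values in (0, 1), so a gate near 0 blocks the signal and a gate near 1 passes it almost unchanged. Tanh outputs values in (-1, 1); a negative value would flip the sign of the signal rather than just attenuate it, which makes tanh unsuitable for gating. The LSTM still uses tanh for the candidate cell values and the cell output, where signed values are wanted.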
Final Answer:
Sigmoid gates control information flow by outputting values between 0 and 1 -> Option B
Quick Check:
Sigmoid gates = control flow with 0-1 output
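To make the gating idea concrete, here is a minimal sketch of one LSTM step on scalar values, written in plain Python. The function name `lstm_cell_step`, the weight layout, and all the numbers are assumptions chosen for illustration; a real layer uses weight matrices and vectors. It shows the three sigmoid gates acting as fractions and tanh supplying the signed candidate value.

```python
import math


def sigmoid(x):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_cell_step(x, h_prev, c_prev, w):
    """One scalar LSTM step.

    w maps a name ("f", "i", "o", "g") to a tuple (w_x, w_h, b).
    This layout is a hypothetical simplification for illustration.
    """
    def pre(name):
        w_x, w_h, b = w[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("f"))    # forget gate: fraction of old cell state to keep
    i = sigmoid(pre("i"))    # input gate: fraction of the candidate to write
    o = sigmoid(pre("o"))    # output gate: fraction of the cell state to expose
    g = math.tanh(pre("g"))  # candidate value: signed content in (-1, 1)

    c = f * c_prev + i * g   # gates scale the signals, they never flip signs
    h = o * math.tanh(c)
    return h, c, (f, i, o, g)


# Illustrative weights, not taken from any trained model.
weights = {
    "f": (0.5, 0.5, 1.0),
    "i": (0.5, 0.5, 0.0),
    "o": (0.5, 0.5, 0.0),
    "g": (1.0, 1.0, 0.0),
}

h, c, (f, i, o, g) = lstm_cell_step(x=1.0, h_prev=0.0, c_prev=2.0, w=weights)
print("gates (f, i, o):", f, i, o)
print("candidate g:", g)
print("new cell state c:", c, "new hidden state h:", h)

# A strongly negative forget-gate bias drives f toward 0, so the old
# cell state is almost entirely discarded.
forgetful = dict(weights, f=(0.5, 0.5, -20.0))
_, c_forget, (f_small, i2, _, g2) = lstm_cell_step(1.0, 0.0, 2.0, forgetful)
print("forget gate near zero:", f_small, "cell state:", c_forget)
```

If the gates used tanh instead, a negative gate value would multiply the cell state by a negative number and invert the stored information, which is not what "keep or discard" is meant to do.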

Quick Trick: Sigmoid gates control info flow with 0-1 output
