Why does an LSTM cell use gates with sigmoid activations instead of just using tanh activations everywhere?

Difficulty: hard · Conceptual · Q10 of 15
NLP - Sequence Models for NLP
A. Tanh activations are too slow to compute
B. Sigmoid gates control information flow by outputting values between 0 and 1
C. Sigmoid activations prevent overfitting better than tanh
D. Tanh activations cannot be used in recurrent networks
Step-by-Step Solution
Solution:
  1. Step 1: Understand gate function in LSTM

    Each gate decides how much information to keep or discard. It does this by multiplying a signal element-wise by a gate value, so that value must lie between 0 and 1.
  2. Step 2: Recognize sigmoid role

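    Sigmoid outputs values in (0, 1), so a gate acts as a soft switch that scales a signal from almost fully blocked (near 0) to almost fully passed (near 1). Tanh outputs values in (-1, 1); a negative gate value would flip the sign of the signal instead of just scaling it, which makes tanh unsuitable for gating. LSTMs still use tanh elsewhere: for the candidate cell values and for squashing the cell state before it is output.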
  3. Final Answer:

    Sigmoid gates control information flow by outputting values between 0 and 1 -> Option B
  4. Quick Check:

    Sigmoid gates = control flow with 0-1 output [OK]
Quick Trick: Sigmoid gates control info flow with 0-1 output
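
The reasoning above can be checked numerically. The following is a minimal sketch, not taken from the quiz itself: the helper names `sigmoid` and `gate` and the value `cell_value` are hypothetical, chosen only to show how a sigmoid gate scales a signal while a tanh "gate" can flip its sign.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid; the output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def gate(value: float, pre_activation: float, squash) -> float:
    """Scale `value` by a gate computed from `pre_activation`."""
    return squash(pre_activation) * value


cell_value = 2.0  # hypothetical piece of stored information

# A sigmoid gate only scales the value between "blocked" and "passed".
sig_closed = gate(cell_value, -6.0, sigmoid)  # close to 0: mostly blocked
sig_open = gate(cell_value, 6.0, sigmoid)     # close to 2.0: mostly passed

# A tanh "gate" can go negative, which flips the sign of the value.
tanh_flipped = gate(cell_value, -6.0, math.tanh)  # close to -2.0

print(sig_closed, sig_open, tanh_flipped)
```

With the sigmoid, the same negative pre-activation that closes the gate simply shrinks the value toward zero; with tanh it produces roughly the negated value, which is not what "keep or discard" means.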
Common Mistakes:
  • Thinking tanh is used for gates
  • Believing sigmoid is slower than tanh
  • Assuming tanh cannot be used in RNNs
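
To make the division of labor concrete, here is a small sketch of one step of a standard LSTM cell, reduced to scalars for readability. It is an illustration under assumptions, not a reference implementation: the function `lstm_step`, the `weights` dictionary layout, and all numeric values are hypothetical. Note that sigmoid appears only in the three gates, while tanh is used for the candidate value and for squashing the cell state.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid; the output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One scalar LSTM step; `w` maps a name to (w_x, w_h, bias)."""

    def pre(name):
        w_x, w_h, b = w[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))       # gate: how much old cell state to keep
    i = sigmoid(pre("input"))        # gate: how much new candidate to admit
    o = sigmoid(pre("output"))       # gate: how much cell state to expose
    g = math.tanh(pre("candidate"))  # candidate content, in (-1, 1)

    c = f * c_prev + i * g           # gated update of the cell state
    h = o * math.tanh(c)             # gated, squashed hidden state
    return h, c


# Hypothetical weights: forget and output gates almost fully open,
# input gate almost fully closed, so the old cell state is carried over.
weights = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 10.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.5, w=weights)
print(h, c)
```

With these weights the new cell state stays very close to the previous value of 0.5, because the forget gate is near 1 and the input gate is near 0. Swapping the gate activations for tanh would let `f`, `i`, or `o` go negative and invert the signals they are meant to scale.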
