Hard · Application · Question 9 of 15
NLP - Sequence Models for NLP
To enhance an RNN text classifier's ability to capture long-range dependencies and reduce vanishing gradient issues, which architecture change is most effective?
A. Increase the number of SimpleRNN units without changing the architecture
B. Replace SimpleRNN with an LSTM or GRU layer
C. Add more Dense layers after the SimpleRNN layer
D. Use a smaller batch size during training
Step-by-Step Solution
  1. Step 1: Understand the vanishing gradient problem

    SimpleRNNs struggle with long sequences because backpropagation through time repeatedly multiplies gradients by factors smaller than one, so the signal from distant timesteps decays toward zero.
  2. Step 2: Identify architectures that mitigate this

    LSTM and GRU units use gating mechanisms (input, forget, and output gates in LSTMs; update and reset gates in GRUs) that let gradients flow through long sequences largely unattenuated.
  3. Step 3: Evaluate the other options

    Increasing SimpleRNN units or adding Dense layers does not change how gradients flow through time, so the vanishing gradient problem remains. Batch size affects optimization noise and speed, not gradient flow across timesteps.
  4. Final Answer:

    Replace SimpleRNN with an LSTM or GRU layer -> Option B
  5. Quick Check:

    LSTM/GRU handle long-range dependencies better ✓
Quick Trick: Use LSTM/GRU to fix vanishing gradients ✓
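The shrinking-gradient effect from Step 1 can be seen with a toy calculation. The sketch below (an illustrative assumption, not the exact RNN math) models backpropagation through a SimpleRNN as repeated multiplication by the recurrent weight times the tanh derivative, both of which are at most 1, so the gradient decays geometrically with sequence length:

```python
import math

def simple_rnn_gradient(steps, w=0.9, h=0.5):
    """Toy gradient magnitude after backpropagating through `steps`
    timesteps of a scalar SimpleRNN with recurrent weight `w` and
    constant hidden pre-activation `h` (illustrative values)."""
    grad = 1.0
    for _ in range(steps):
        # d tanh(x)/dx = 1 - tanh(x)^2 <= 1, so each step shrinks the gradient
        grad *= w * (1 - math.tanh(h) ** 2)
    return grad

print(simple_rnn_gradient(10))   # still a usable magnitude
print(simple_rnn_gradient(100))  # effectively zero: the vanishing gradient
```

LSTM and GRU cells avoid this decay because their additive cell-state updates and gates give the gradient a path that is not repeatedly squashed by tanh derivatives, which is why Option B is the effective fix.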
Common Mistakes:
  • Thinking more units fix vanishing gradients
  • Assuming Dense layers improve sequence memory
  • Believing batch size affects gradient flow through time
