Hard · Application · Question 9 of 15
NLP - Sequence Models for NLP
To enhance an RNN text classifier's ability to capture long-range dependencies and reduce vanishing gradient issues, which architecture change is most effective?
A. Increase the number of SimpleRNN units without changing the architecture
B. Replace SimpleRNN with an LSTM or GRU layer
C. Add more Dense layers after the SimpleRNN layer
D. Use a smaller batch size during training
Step-by-Step Solution
  1. Step 1: Understand the vanishing gradient problem

    SimpleRNNs struggle with long sequences because backpropagation through time repeatedly multiplies gradients by factors smaller than one, so the signal from distant timesteps decays toward zero.
  2. Step 2: Identify architectures that mitigate this

    LSTM and GRU units use gating mechanisms (input, forget, and output gates in LSTMs; update and reset gates in GRUs) that let gradients flow through long sequences largely unattenuated.
  3. Step 3: Evaluate the other options

    Increasing SimpleRNN units or adding Dense layers does not change how gradients flow through time, so the vanishing gradient problem remains. Batch size affects optimization noise and speed, not gradient flow across timesteps.
  4. Final Answer:

    Replace SimpleRNN with an LSTM or GRU layer -> Option B
  5. Quick Check:

    LSTM/GRU handle long-range dependencies better ✓
Quick Trick: Use LSTM/GRU to fix vanishing gradients ✓
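The shrinking-gradient effect from Step 1 can be seen with a toy calculation. The sketch below (an illustrative assumption, not the exact RNN math) models backpropagation through a SimpleRNN as repeated multiplication by the recurrent weight times the tanh derivative, both of which are at most 1, so the gradient decays geometrically with sequence length:

```python
import math

def simple_rnn_gradient(steps, w=0.9, h=0.5):
    """Toy gradient magnitude after backpropagating through `steps`
    timesteps of a scalar SimpleRNN with recurrent weight `w` and
    constant hidden pre-activation `h` (illustrative values)."""
    grad = 1.0
    for _ in range(steps):
        # d tanh(x)/dx = 1 - tanh(x)^2 <= 1, so each step shrinks the gradient
        grad *= w * (1 - math.tanh(h) ** 2)
    return grad

print(simple_rnn_gradient(10))   # still a usable magnitude
print(simple_rnn_gradient(100))  # effectively zero: the vanishing gradient
```

LSTM and GRU cells avoid this decay because their additive cell-state updates and gates give the gradient a path that is not repeatedly squashed by tanh derivatives, which is why Option B is the effective fix.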
Common Mistakes:
  • Thinking more units fix vanishing gradients
  • Assuming Dense layers improve sequence memory
  • Believing batch size affects gradient flow through time
