Hard · Application · Q9 of 15
NLP · Sequence Models for NLP
You want to combine a GRU layer with an attention mechanism for text classification. Which of the following best describes how to integrate attention with GRU outputs?
A. Use attention only on the initial input embeddings, ignoring GRU outputs
B. Apply attention weights on GRU outputs across all time steps before classification
C. Replace GRU gates with attention weights
D. Apply attention on the hidden state of the last GRU layer only
Step-by-Step Solution
  1. Step 1: Understand attention in sequence models

    Attention assigns an importance weight to the GRU output at each time step, so no single hidden state has to summarize the whole sequence.
  2. Step 2: Apply attention across all GRU outputs

    The attention-weighted sum of the outputs forms a context vector that emphasizes the most relevant words before the classification layer.
  3. Final Answer:

    Apply attention weights on GRU outputs across all time steps before classification -> Option B
  4. Quick Check:

    Attention weights over all outputs capture more context than the last hidden state alone [OK]
Quick Trick: Attention weights highlight which GRU outputs matter most across time [OK]
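The recipe in Option B can be sketched in a few lines. This is a minimal NumPy illustration, not a full model: the GRU outputs H and the attention/classifier parameters (w, W_out) are stand-ins initialised randomly here; in a real classifier they would come from a trained recurrent layer and learned weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: T time steps, d hidden units, C classes.
T, d, C = 6, 8, 3

# Stand-in for the GRU outputs, one d-dimensional vector per time step.
H = rng.normal(size=(T, d))

# Illustrative learned parameters (random here for the sketch).
w = rng.normal(size=(d,))        # attention scoring vector
W_out = rng.normal(size=(d, C))  # classification weights

# 1. Score every time step's GRU output.
scores = H @ w                   # shape (T,)

# 2. Softmax over time steps -> attention weights that sum to 1.
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()             # shape (T,)

# 3. Attention-weighted sum of all GRU outputs = context vector.
context = alpha @ H              # shape (d,)

# 4. Classify from the context vector.
logits = context @ W_out         # shape (C,)
print(alpha.round(3), logits.shape)
```

Note that the weighted sum in step 3 uses every time step's output, which is exactly what distinguishes Option B from attending only to the last hidden state (Option D).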
Common Mistakes:
  • Replacing the GRU gates with attention weights (attention complements gating; it does not replace it)
  • Computing attention over the input embeddings while ignoring the GRU outputs
  • Applying attention only to the last hidden state, which discards information from earlier time steps
