Recall & Review
beginner
What is the main purpose of the encoder in an encoder-decoder model?
The encoder reads the input data and converts it into a fixed-size representation (a context vector) that summarizes the input information for the decoder to use.
beginner
Why do we use attention in encoder-decoder models?
Attention helps the decoder focus on different parts of the input sequence at each step, instead of relying on a single fixed context vector. This improves performance, especially for long sequences.
intermediate
Describe how the attention mechanism works in simple terms.
At each step, the decoder looks at all encoder outputs and assigns weights (attention scores) to them. These weights show how important each input part is for generating the current output word.
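The scoring-and-weighting described above can be sketched in a few lines of NumPy. This is a minimal, hypothetical illustration of dot-product attention for a single decoder step, not any particular library's implementation; the function and variable names are chosen for clarity.

```python
import numpy as np

def softmax(x):
    # subtract the max for numerical stability before exponentiating
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_step(decoder_state, encoder_outputs):
    """One decoder step of simple dot-product attention (a sketch).

    decoder_state:    (hidden_dim,) current decoder hidden state
    encoder_outputs:  (seq_len, hidden_dim) one hidden state per input token
    """
    # score each encoder output against the current decoder state
    scores = encoder_outputs @ decoder_state      # (seq_len,)
    # normalize scores into attention weights that sum to 1
    weights = softmax(scores)                     # (seq_len,)
    # context vector: weighted sum of encoder outputs
    context = weights @ encoder_outputs           # (hidden_dim,)
    return context, weights

# toy example: 4 input tokens, hidden size 3
rng = np.random.default_rng(0)
encoder_outputs = rng.normal(size=(4, 3))
decoder_state = rng.normal(size=3)
context, weights = attention_step(decoder_state, encoder_outputs)
```

The `weights` vector is exactly the "attention scores" the answer refers to: one nonnegative number per input token, summing to 1, indicating how much each token contributes to the current output.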
intermediate
What is the difference between the context vector in a basic encoder-decoder and one with attention?
In a basic model, the context vector is fixed and the same for all output steps. With attention, the context vector changes at each step, computed as a weighted sum of encoder outputs based on attention scores.
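The fixed-versus-dynamic contrast above can be made concrete with a short NumPy sketch. This is an illustrative toy (random vectors standing in for real hidden states), assuming the common choice of using the final encoder hidden state as the fixed context.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(1)
encoder_outputs = rng.normal(size=(5, 4))  # 5 input tokens, hidden size 4
decoder_states = rng.normal(size=(3, 4))   # decoder states for 3 output steps

# basic encoder-decoder: one fixed context, reused at every output step
fixed_context = encoder_outputs[-1]        # e.g. the final encoder hidden state

# with attention: a fresh context vector computed at each output step
attn_contexts = []
for dec_state in decoder_states:
    weights = softmax(encoder_outputs @ dec_state)
    attn_contexts.append(weights @ encoder_outputs)
```

In the basic model the decoder sees the same `fixed_context` at every step; with attention, each entry of `attn_contexts` is a different weighted sum of the encoder outputs, shaped by where that step's decoder state "looks".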
intermediate
How does attention improve translation quality in machine translation tasks?
Attention allows the model to align output words with relevant input words dynamically, helping it handle long sentences and complex structures better than models with a single fixed context vector.
What does the encoder output in an encoder-decoder model with attention?
The encoder outputs a sequence of hidden states that represent the input sequence, which the attention mechanism uses.
In attention, what do the attention weights represent?
Attention weights show how much each input token should influence the current output token.
Why is a fixed context vector limiting in basic encoder-decoder models?
A fixed context vector compresses all input information into a single vector, which can lose details, especially for long inputs.
Which part of the model uses attention scores to generate output?
The decoder uses attention scores to focus on relevant encoder outputs when producing each output token.
What is a common benefit of adding attention to encoder-decoder models?
Attention helps models better process long sequences by focusing on important parts dynamically.
Explain how the attention mechanism changes the way the decoder generates each output token compared to a basic encoder-decoder model.
Think about how the decoder decides what input information to use at each step.
Describe the roles of the encoder, decoder, and attention mechanism in an encoder-decoder model with attention.
Consider how these components work together to produce output.