Recall & Review
beginner
What is positional encoding in the context of machine learning models like Transformers?
Positional encoding is a way to add information about the position of each element in a sequence, so the model knows where each element is located. This is needed because models like Transformers have no built-in notion of order.
beginner
Why do Transformers need positional encoding?
Transformers process all input tokens simultaneously without any inherent order. Positional encoding helps the model understand the sequence order, which is important for tasks like language understanding.
intermediate
How is sinusoidal positional encoding calculated?
It uses sine and cosine functions of different frequencies to create unique position vectors. For position pos and dimension i: PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). This helps the model learn relative positions.
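As a minimal sketch (not part of the original card text), the formulas above can be implemented directly in NumPy; the even dimensions get the sine term and the odd dimensions get the matching cosine term:

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Return the sinusoidal positional encoding matrix of shape (seq_len, d_model)."""
    positions = np.arange(seq_len)[:, np.newaxis]            # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[np.newaxis, :]           # (1, d_model / 2)
    angles = positions / np.power(10000.0, dims / d_model)   # (seq_len, d_model / 2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions: PE(pos, 2i)   = sin(...)
    pe[:, 1::2] = np.cos(angles)   # odd dimensions:  PE(pos, 2i+1) = cos(...)
    return pe

pe = sinusoidal_positional_encoding(50, 16)
print(pe.shape)  # (50, 16)
```

Note that every entry lies in [-1, 1], and at position 0 the pattern is sin(0) = 0 in even dimensions and cos(0) = 1 in odd ones, which is a quick sanity check for an implementation.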
intermediate
What is the shape of the positional encoding tensor for a batch of sequences?
The positional encoding tensor usually has shape (sequence_length, embedding_dimension). When added to input embeddings, it matches the input shape (batch_size, sequence_length, embedding_dimension) by broadcasting over the batch.
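The broadcasting behavior described above can be checked with toy shapes (the sizes here are illustrative, not from the cards):

```python
import numpy as np

batch_size, seq_len, d_model = 4, 10, 8
embeddings = np.random.randn(batch_size, seq_len, d_model)  # (batch, seq, dim)
pe = np.zeros((seq_len, d_model))                           # stand-in positional encoding

# (seq_len, d_model) broadcasts over the leading batch dimension
out = embeddings + pe
print(out.shape)  # (4, 10, 8)
```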
beginner
How does adding positional encoding affect the input embeddings in a Transformer?
Positional encoding is added element-wise to the input embeddings. This combined input carries both the token meaning and its position, allowing the Transformer to use position information during training and prediction.
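A tiny sketch of this element-wise addition, using made-up 4-dimensional vectors: the same token embedding placed at two different positions ends up with two distinct combined representations.

```python
import numpy as np

token_embedding = np.array([0.5, -0.2, 0.1, 0.3])  # same token at two positions
# toy positional vectors for positions 0 and 1 (illustrative values only)
pe = np.array([[0.0, 1.0, 0.0, 1.0],
               [0.8, 0.5, 0.1, 1.0]])

combined = token_embedding + pe  # element-wise addition, one row per position
# identical tokens now carry different position information
print(np.array_equal(combined[0], combined[1]))  # False
```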
Why can't Transformers rely on sequence order without positional encoding?
Transformers process tokens in parallel, so they need positional encoding to know the order of tokens.
Which functions are used in sinusoidal positional encoding?
Sinusoidal positional encoding uses sine and cosine functions to encode positions.
What does the positional encoding vector depend on?
Positional encoding depends on the position index and the embedding dimension to create unique vectors.
How is positional encoding combined with input embeddings?
Positional encoding is added element-wise to input embeddings to include position information.
What is the main benefit of sinusoidal positional encoding?
Sinusoidal encoding helps the model understand relative positions between tokens.
Explain in your own words why positional encoding is important for Transformer models.
Think about how Transformers see all words at once.
Describe how sinusoidal positional encoding is calculated and why sine and cosine functions are used.
Focus on the math functions and their role in encoding position.