Challenge - 5 Problems
Positional Encoding Mastery
Get all challenges correct to earn this badge!
Test your skills under time pressure!
❓ Predict Output
Intermediate · 1:30 remaining
Output of Positional Encoding Tensor Shape
What is the shape of the positional encoding tensor generated by the following PyTorch code snippet?
PyTorch
import torch
import math

def positional_encoding(seq_len, d_model):
    pe = torch.zeros(seq_len, d_model)
    position = torch.arange(0, seq_len, dtype=torch.float).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe

pe_tensor = positional_encoding(50, 512)
print(pe_tensor.shape)
Attempts: 2 left
💡 Hint
Think about how the positional encoding is created for each position and each dimension.
✗ Incorrect
The positional encoding tensor has shape (sequence length, model dimension). Here, sequence length is 50 and model dimension is 512, so the shape is (50, 512).
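A quick way to confirm this: the sine values fill the even-indexed columns and the cosine values fill the odd-indexed columns, so each slice covers half of the 512 dimensions while the full tensor keeps shape (50, 512). A minimal sketch:

```python
import torch

pe = torch.zeros(50, 512)
# Even-indexed columns receive sin, odd-indexed columns receive cos;
# each slice is half of the 512 dimensions, so the full tensor stays (50, 512).
print(pe[:, 0::2].shape)  # torch.Size([50, 256])
print(pe[:, 1::2].shape)  # torch.Size([50, 256])
print(pe.shape)           # torch.Size([50, 512])
```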
🧠 Conceptual
Intermediate · 1:30 remaining
Purpose of Positional Encoding in Transformers
Why do transformer models use positional encoding?
Attempts: 2 left
💡 Hint
Transformers process all tokens simultaneously without sequence order. How do they know token positions?
✗ Incorrect
Transformers do not have recurrence or convolution, so they cannot inherently understand token order. Positional encoding adds unique position information to each token embedding.
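This order-blindness can be demonstrated directly. The sketch below uses a toy single-head attention with identity projections (an illustration, not the question's code): without positional encoding, permuting the input tokens merely permutes the outputs in the same way, so the model cannot tell one ordering from another.

```python
import torch

torch.manual_seed(0)
x = torch.randn(5, 8)                 # 5 tokens, 8-dim embeddings (toy sizes)
perm = torch.tensor([4, 2, 0, 3, 1])  # an arbitrary reordering of the tokens

def self_attention(x):
    # Minimal attention with identity Q/K/V projections: softmax(x x^T / sqrt(d)) x
    attn = torch.softmax(x @ x.T / x.shape[-1] ** 0.5, dim=-1)
    return attn @ x

out = self_attention(x)
out_perm = self_attention(x[perm])
# Permutation equivariance: shuffling the inputs shuffles the outputs identically.
print(torch.allclose(out[perm], out_perm, atol=1e-5))  # True
```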
❓ Hyperparameter
Advanced · 2:00 remaining
Effect of Changing Model Dimension in Positional Encoding
If you increase the model dimension (d_model) in the positional encoding function, what is the expected effect on the positional encoding vectors?
Attempts: 2 left
💡 Hint
Consider what d_model controls in the positional encoding tensor shape.
✗ Incorrect
Increasing d_model increases the number of dimensions in the positional encoding, allowing the model to encode position information with more detail.
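One way to see this: `d_model` controls how many sine/cosine frequency pairs the encoding uses. The sketch below (the helper name `frequencies` is just for illustration) shows that a larger `d_model` samples the same frequency range, from 1 down toward 1/10000, more densely:

```python
import math
import torch

def frequencies(d_model):
    # Angular frequencies used by sinusoidal PE: 1 / 10000^(2i / d_model)
    return torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))

f_small = frequencies(8)
f_large = frequencies(32)
# A larger d_model yields more frequency components over the same range,
# giving finer-grained position information per token.
print(len(f_small), len(f_large))  # 4 16
```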
🔧 Debug
Advanced · 2:00 remaining
Identify the Error in Positional Encoding Code
What error will the following PyTorch code raise when executed?
PyTorch
import torch
import math

def positional_encoding(seq_len, d_model):
    pe = torch.zeros(seq_len, d_model)
    position = torch.arange(0, seq_len).float().unsqueeze(1)
    div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe

pe_tensor = positional_encoding(50, 512)
Attempts: 2 left
💡 Hint
Check the data types and shapes of tensors used in multiplication and assignment.
✗ Incorrect
torch.arange returns an integer (int64) tensor by default. Here the div_term arange is never cast to float, so in older PyTorch versions multiplying it by a float scalar raises a RuntimeError about mismatched result types; the fix is to cast with .float() first. (Recent PyTorch versions promote the dtype automatically, so the snippet may run without error there.)
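A corrected version, assuming the intended fix is an explicit `.float()` cast on the arange feeding `div_term`, might look like this sketch:

```python
import math
import torch

def positional_encoding(seq_len, d_model):
    pe = torch.zeros(seq_len, d_model)
    position = torch.arange(0, seq_len).float().unsqueeze(1)
    # Casting to float before the scalar multiply avoids integer-dtype issues
    # on PyTorch versions without implicit type promotion.
    div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe

pe_tensor = positional_encoding(50, 512)
print(pe_tensor.shape)  # torch.Size([50, 512])
```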
❓ Model Choice
Expert · 2:30 remaining
Choosing Positional Encoding Type for a Transformer Model
You want to build a transformer model for a task with very long sequences (e.g., 10,000 tokens). Which positional encoding approach is best to handle this scenario?
Attempts: 2 left
💡 Hint
Consider which encoding can extrapolate beyond training sequence lengths.
✗ Incorrect
Sinusoidal positional encoding uses fixed mathematical functions that can generalize to any sequence length, unlike learned embeddings limited by training length.
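As a sketch of that extrapolation property, the same closed-form sinusoidal function can generate encodings for 10,000 positions without any training, whereas a learned position embedding table is hard-capped at the number of rows it was trained with:

```python
import math
import torch

def sinusoidal_pe(seq_len, d_model):
    pe = torch.zeros(seq_len, d_model)
    position = torch.arange(0, seq_len, dtype=torch.float).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe

# The closed-form function covers any length, including positions never
# seen during training; a learned table like nn.Embedding(512, d_model)
# could not index past position 511.
pe = sinusoidal_pe(10_000, 64)
print(pe.shape)  # torch.Size([10000, 64])
```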