PyTorch · ~10 mins

Why attention revolutionized deep learning in PyTorch - Test Your Understanding

Practice - 5 Tasks
Answer the questions below
Task 1: fill in the blank (easy)

Complete the code to create a simple attention score using dot product.

PyTorch
import torch

query = torch.tensor([[1, 0, 1]], dtype=torch.float32)
key = torch.tensor([[0, 1, 0]], dtype=torch.float32)

attention_score = torch.matmul(query, [1].T)
print(attention_score)
A. key
B. query
C. torch.tensor([[1, 1, 1]])
D. torch.eye(3)
Common Mistakes
Using query instead of key for the dot product.
Not transposing the key tensor before multiplication.
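For reference after attempting the task, a completed version of the snippet (the blank filled with `key`, transposed so the shapes align):

```python
import torch

query = torch.tensor([[1, 0, 1]], dtype=torch.float32)
key = torch.tensor([[0, 1, 0]], dtype=torch.float32)

# Dot product of query and key gives the raw attention score.
# key must be transposed: (1, 3) @ (3, 1) -> (1, 1).
attention_score = torch.matmul(query, key.T)
print(attention_score)  # tensor([[0.]]) -- these two vectors are orthogonal
```

The score is zero here because the query and key vectors share no non-zero positions.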
Task 2: fill in the blank (medium)

Complete the code to apply softmax to the attention scores.

PyTorch
import torch
import torch.nn.functional as F

scores = torch.tensor([[1.0, 2.0, 3.0]])
attention_weights = F.[1](scores, dim=1)
print(attention_weights)
A. sigmoid
B. relu
C. softmax
D. tanh
Common Mistakes
Using sigmoid which does not normalize across the dimension.
Using relu or tanh which do not produce probability distributions.
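For reference after attempting the task, the completed snippet. `softmax` is the only option that turns the scores into a probability distribution over `dim=1`:

```python
import torch
import torch.nn.functional as F

scores = torch.tensor([[1.0, 2.0, 3.0]])
# softmax exponentiates and normalizes along dim=1, so the weights sum to 1
attention_weights = F.softmax(scores, dim=1)
print(attention_weights)  # roughly tensor([[0.0900, 0.2447, 0.6652]])
```

Sigmoid squashes each score independently, so the outputs would not sum to 1; relu and tanh do not normalize at all.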
Task 3: fill in the blank (hard)

Fix the error in the scaled dot-product attention calculation.

PyTorch
import torch
import math

query = torch.randn(1, 4)
key = torch.randn(1, 4)
scale = math.sqrt([1])
scores = torch.matmul(query, key.T) / scale
print(scores)
A. query.shape[0]
B. key.shape[1]
C. key.shape[0]
D. query.shape[1]
Common Mistakes
Using batch size dimension instead of feature dimension for scaling.
Confusing key.shape[0] (the batch dimension) with key.shape[1] (the feature dimension).
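For reference after attempting the task, a completed version. The divisor must come from the feature dimension (`query.shape[1]`, here 4), not the batch dimension:

```python
import torch
import math

torch.manual_seed(0)  # reproducible example (seed is an assumption, not from the task)
query = torch.randn(1, 4)
key = torch.randn(1, 4)

# Scale by sqrt(d_k), where d_k = query.shape[1] is the feature dimension.
# query.shape[0] is the batch size and would be the wrong divisor.
scale = math.sqrt(query.shape[1])
scores = torch.matmul(query, key.T) / scale
print(scores)
```

Scaling by sqrt(d_k) keeps the dot products from growing with the feature size, which would otherwise push softmax into regions with vanishing gradients.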
Task 4: fill in the blank (hard)

Fill both blanks to complete the attention output calculation.

PyTorch
import torch
import torch.nn.functional as F

query = torch.randn(1, 3)
key = torch.randn(1, 3)
value = torch.randn(1, 3)

scores = torch.matmul(query, key.T) / torch.sqrt(torch.tensor(query.shape[[1]], dtype=torch.float32))
weights = F.softmax(scores, dim=[2])
output = torch.matmul(weights, value)
print(output)
A. 1
B. 0
C. 2
D. -1
Common Mistakes
Using wrong shape index for scaling.
Applying softmax on wrong dimension.
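For reference after attempting the task, a completed version with both blanks filled: the scale uses shape index 1 (the feature dimension), and softmax runs along dim=1 (across the keys):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # reproducible example (seed is an assumption, not from the task)
query = torch.randn(1, 3)
key = torch.randn(1, 3)
value = torch.randn(1, 3)

# shape index 1 is d_k, the feature dimension
scores = torch.matmul(query, key.T) / torch.sqrt(torch.tensor(query.shape[1], dtype=torch.float32))
# dim=1 normalizes across the key axis, so each query's weights sum to 1
weights = F.softmax(scores, dim=1)
output = torch.matmul(weights, value)
print(output)
```

With a single query and key, `weights` is a 1x1 tensor equal to 1, so the output equals `value`; the dimension choices matter once there are multiple keys.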
Task 5: fill in the blank (hard)

Fill all three blanks to implement a simple attention mechanism output.

PyTorch
import torch
import torch.nn.functional as F

query = torch.randn(2, 4)
key = torch.randn(2, 4)
value = torch.randn(2, 4)

scores = torch.matmul(query, key.T) / torch.sqrt(torch.tensor(query.shape[[1]], dtype=torch.float32))
weights = F.softmax(scores, dim=[2])
attention_output = torch.matmul(weights, [3])
print(attention_output)
A. 1
B. 0
C. value
D. key
Common Mistakes
Using key instead of value for final multiplication.
Incorrect dimension indices for scaling or softmax.
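For reference after attempting the task, a completed version with all three blanks filled. The final matmul must use `value`: the attention weights say how much of each value row to mix into the output, while `key` was only used to compute the scores:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # reproducible example (seed is an assumption, not from the task)
query = torch.randn(2, 4)
key = torch.randn(2, 4)
value = torch.randn(2, 4)

# scale by sqrt(d_k) with d_k = query.shape[1] = 4
scores = torch.matmul(query, key.T) / torch.sqrt(torch.tensor(query.shape[1], dtype=torch.float32))
# softmax over dim=1: each query's weights over the 2 keys sum to 1
weights = F.softmax(scores, dim=1)
# weighted sum of value rows: (2, 2) @ (2, 4) -> (2, 4)
attention_output = torch.matmul(weights, value)
print(attention_output)
```

Multiplying by `key` instead of `value` would type-check here (the shapes match), which is exactly why it is a common mistake: the error is semantic, not a shape error.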