PyTorch · ~10 mins

Transformer decoder in PyTorch - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to create a TransformerDecoderLayer with 8 attention heads.

PyTorch
import torch.nn as nn

decoder_layer = nn.TransformerDecoderLayer(d_model=512, nhead=[1])
A. 2
B. 16
C. 4
D. 8
Common Mistakes
Using a number of heads that does not divide d_model evenly.
Confusing nhead with number of layers.
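For reference, a runnable sketch of constructing and exercising a decoder layer. The head count must divide `d_model` evenly (512 / 8 = 64 dims per head); the tensor shapes below are illustrative dummy inputs.

```python
import torch
import torch.nn as nn

# nhead must divide d_model evenly: 512 / 8 = 64 dims per head.
decoder_layer = nn.TransformerDecoderLayer(d_model=512, nhead=8)

# Default layout is (seq_len, batch_size, d_model).
tgt = torch.rand(20, 32, 512)     # target sequence
memory = torch.rand(10, 32, 512)  # encoder output
out = decoder_layer(tgt, memory)
print(out.shape)  # torch.Size([20, 32, 512]) -- same shape as tgt
```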
2. Fill in the blank (medium)

Complete the code to pass the memory tensor to the TransformerDecoder.

PyTorch
import torch
import torch.nn as nn

memory = torch.rand(10, 32, 512)  # (sequence_length, batch_size, d_model)
decoder_layer = nn.TransformerDecoderLayer(d_model=512, nhead=8)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=6)
output = decoder(tgt=torch.rand(20, 32, 512), memory=[1])
A. memory
B. tgt
C. output
D. src
Common Mistakes
Passing the target tensor instead of memory.
Passing an undefined variable.
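A complete sketch of the call pattern this task practices: the decoder takes the target sequence plus the encoder output, conventionally named `memory`, and its output matches the target's shape.

```python
import torch
import torch.nn as nn

decoder_layer = nn.TransformerDecoderLayer(d_model=512, nhead=8)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=6)

memory = torch.rand(10, 32, 512)  # encoder output: (src_len, batch, d_model)
tgt = torch.rand(20, 32, 512)     # target sequence: (tgt_len, batch, d_model)

# The decoder cross-attends from tgt into memory.
output = decoder(tgt=tgt, memory=memory)
print(output.shape)  # torch.Size([20, 32, 512]) -- follows tgt, not memory
```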
3. Fill in the blank (hard)

Fix the error in the code by selecting the correct mask to prevent the decoder from attending to future tokens.

PyTorch
import torch
import torch.nn as nn

decoder_layer = nn.TransformerDecoderLayer(d_model=512, nhead=8)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=6)
tgt = torch.rand(20, 32, 512)
memory = torch.rand(10, 32, 512)
size = tgt.size(0)
mask = torch.triu(torch.ones(size, size), diagonal=[1]).bool()
output = decoder(tgt, memory, tgt_mask=mask)
A. 0
B. -1
C. 1
D. 2
Common Mistakes
Using diagonal=0, which masks the current token as well.
Using a negative diagonal, which masks previous tokens too and is invalid here.
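A small sketch of why `diagonal=1` is the causal choice: it keeps only the strictly upper triangle, so row `i` is `True` exactly for columns `j > i`, and for PyTorch's boolean attention masks `True` means "do not attend".

```python
import torch

# A 4x4 example is enough to see the pattern.
size = 4
mask = torch.triu(torch.ones(size, size), diagonal=1).bool()
print(mask)
# Row i is True only for j > i: each position is blocked from future tokens
# but can still attend to itself and to the past.
```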
4. Fill in the blank (hard)

Fill both blanks to create a TransformerDecoderLayer with dropout and ReLU activation.

PyTorch
import torch.nn as nn

decoder_layer = nn.TransformerDecoderLayer(d_model=512, nhead=8, dropout=[1], activation=[2])
A. 0.1
B. 0.5
C. "relu"
D. "gelu"
Common Mistakes
Passing dropout as an integer instead of a float probability.
Passing the activation name without quotes, which raises a NameError.
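A runnable sketch of the two parameters this task targets: `dropout` is a float probability in [0, 1), and `activation` is passed as a string such as `"relu"` or `"gelu"` (an unquoted name would be a `NameError`). The forward pass below just confirms the layer is well-formed.

```python
import torch
import torch.nn as nn

# dropout=0.1 is a float probability; activation is a string.
layer = nn.TransformerDecoderLayer(d_model=512, nhead=8,
                                   dropout=0.1, activation="relu")

# Quick smoke test with dummy (seq_len, batch, d_model) tensors.
out = layer(torch.rand(5, 2, 512), torch.rand(7, 2, 512))
print(out.shape)  # torch.Size([5, 2, 512])
```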
5. Fill in the blank (hard)

Fill all three blanks to create a TransformerDecoder, pass the target and memory, and apply the correct mask.

PyTorch
import torch
import torch.nn as nn

memory = torch.rand(15, 64, 256)
tgt = torch.rand(30, 64, 256)
decoder_layer = nn.TransformerDecoderLayer(d_model=256, nhead=4)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=[1])
size = tgt.size(0)
mask = torch.triu(torch.ones(size, size), diagonal=[2]).bool()
output = decoder(tgt=[3], memory=memory, tgt_mask=mask)
A. 3
B. 1
C. tgt
D. 5
Common Mistakes
Using the wrong number of layers.
Setting mask diagonal to 0 or 2.
Passing memory as tgt.
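An end-to-end sketch of the pattern this final task combines: build the layer stack, build the causal mask with `diagonal=1`, then pass the target as `tgt` (not `memory`). The `num_layers=3` here is illustrative, not the graded answer.

```python
import torch
import torch.nn as nn

memory = torch.rand(15, 64, 256)  # encoder output
tgt = torch.rand(30, 64, 256)     # target sequence

decoder_layer = nn.TransformerDecoderLayer(d_model=256, nhead=4)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=3)  # illustrative depth

# Causal mask over target positions: diagonal=1 hides strictly-future tokens.
size = tgt.size(0)
mask = torch.triu(torch.ones(size, size), diagonal=1).bool()

output = decoder(tgt=tgt, memory=memory, tgt_mask=mask)
print(output.shape)  # torch.Size([30, 64, 256])
```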