Sequence-to-sequence models are widely used in tasks like language translation. What is their main purpose?
Think about tasks like translating a sentence from English to French.
Sequence-to-sequence models take a sequence as input and produce another sequence as output, often of different length, such as translating sentences or summarizing text.
Consider a seq2seq model with an LSTM encoder and decoder. The encoder processes input sequences of length 10 with 16 features, and the decoder takes input sequences of length 12 with 20 features. What is the shape of the encoder's final hidden state and the decoder's output?
import torch
import torch.nn as nn

encoder = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
decoder = nn.LSTM(input_size=20, hidden_size=32, batch_first=True)

inputs = torch.randn(5, 10, 16)  # batch_size=5, seq_len=10, features=16
encoder_outputs, (h_n, c_n) = encoder(inputs)

decoder_inputs = torch.randn(5, 12, 20)  # batch_size=5, seq_len=12, features=20
decoder_outputs, _ = decoder(decoder_inputs, (h_n, c_n))
print(h_n.shape, decoder_outputs.shape)
Remember LSTM hidden states have shape (num_layers * num_directions, batch, hidden_size).
The encoder's final hidden state has shape (1, batch_size, hidden_size) = (1, 5, 32), because the LSTM has 1 layer and is unidirectional. The decoder's output has shape (batch_size, sequence_length, hidden_size) = (5, 12, 32).
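As a sanity check on the (num_layers * num_directions, batch, hidden_size) rule from the hint, here is a minimal sketch with illustrative sizes, using a 2-layer bidirectional LSTM (these sizes are assumptions, not part of the question above):

```python
import torch
import torch.nn as nn

# 2 layers and 2 directions -> first dim of h_n is 2 * 2 = 4.
lstm = nn.LSTM(input_size=16, hidden_size=32, num_layers=2,
               bidirectional=True, batch_first=True)
x = torch.randn(5, 10, 16)  # batch=5, seq_len=10, features=16
out, (h_n, c_n) = lstm(x)
print(h_n.shape)  # torch.Size([4, 5, 32])
print(out.shape)  # torch.Size([5, 10, 64]) -- both directions concatenated
```

Note that the per-step outputs concatenate the forward and backward directions, so their last dimension is 2 * hidden_size.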
You are training a sequence-to-sequence model for machine translation. Which hidden size choice is most likely to improve model capacity without causing excessive overfitting?
Think about balancing model complexity and regularization.
Increasing the hidden size can improve capacity, but it must be paired with regularization such as dropout and early stopping to avoid overfitting.
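A minimal sketch of that pairing in PyTorch; the specific values (hidden_size=256, dropout=0.3) are illustrative assumptions, not recommendations. Note that nn.LSTM applies dropout between stacked layers, so it only takes effect when num_layers is at least 2:

```python
import torch
import torch.nn as nn

# Larger hidden size for capacity, dropout between layers for regularization.
model = nn.LSTM(input_size=16, hidden_size=256, num_layers=2,
                dropout=0.3, batch_first=True)

x = torch.randn(5, 10, 16)  # batch=5, seq_len=10, features=16
out, _ = model(x)
print(out.shape)  # torch.Size([5, 10, 256])
```

Early stopping is implemented in the training loop rather than the model: track validation loss each epoch and stop when it has not improved for a set number of epochs.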
You trained a seq2seq model for text summarization. Which metric best measures how well the model output matches human summaries?
Think about metrics used in natural language generation tasks.
ROUGE compares n-gram overlap between generated and reference text with an emphasis on recall, making it the standard choice for summarization evaluation; BLEU, a precision-oriented n-gram overlap metric, is more commonly used for machine translation.
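The n-gram-overlap idea behind both BLEU and ROUGE can be sketched in a few lines. This is a simplified, single-reference illustration; ngram_precision is a hypothetical helper, not a library function, and real metrics add components such as brevity penalties and multi-reference handling:

```python
from collections import Counter

def ngram_precision(candidate, reference, n=1):
    """Fraction of candidate n-grams that also appear in the reference
    (clipped counts, as in BLEU's modified precision)."""
    cand = Counter(tuple(candidate[i:i + n])
                   for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n])
                  for i in range(len(reference) - n + 1))
    matches = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return matches / max(total, 1)

cand = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
print(ngram_precision(cand, ref, n=1))  # 5 of 6 unigrams match -> 0.833...
```

A recall-oriented metric like ROUGE instead divides matches by the number of reference n-grams, rewarding coverage of the human summary rather than precision of the generated one.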
Consider this simplified training loop for a seq2seq model. Why might the gradients explode?
for input_seq, target_seq in dataloader:
    optimizer.zero_grad()
    output_seq = model(input_seq, target_seq)
    loss = loss_fn(output_seq.view(-1, vocab_size), target_seq.view(-1))
    loss.backward()
    optimizer.step()
Think about common causes of exploding gradients in RNNs.
Without gradient clipping, gradients backpropagated through many time steps in an RNN can grow exponentially (each step multiplies by the recurrent weight matrix), causing instability during training.
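A minimal sketch of adding clipping to a loop like the one above. A tiny linear model and random tensors stand in for the seq2seq model and dataloader so the snippet is self-contained; the fix itself is the single clip_grad_norm_ call between backward() and step():

```python
import torch

# Stand-ins for the real model, data, loss, and optimizer.
model = torch.nn.Linear(8, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()
x, y = torch.randn(5, 8), torch.randn(5, 4)

optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
# Rescale gradients so their global L2 norm is at most max_norm=1.0.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```

Norm clipping preserves the gradient's direction while capping its magnitude, which is why it is preferred over clipping each element independently.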