
Why sequence models understand word order in NLP - Challenge Your Understanding

Challenge - 5 Problems
🧠 Conceptual (intermediate)
Why do sequence models like RNNs process words in order?

Sequence models such as RNNs read words one by one in a sentence. Why is this order important?

A. Because the model ignores previous words and treats each word independently.
B. Because the model uses previous words' information to understand the current word's meaning.
C. Because the model randomly shuffles words before processing to increase variety.
D. Because the model processes all words simultaneously without any order.
💡 Hint

Think about how understanding a sentence depends on the words that came before.
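As a minimal sketch (assuming PyTorch), the following shows that an RNN's final hidden state depends on word order: feeding the same word vectors in reversed order produces a different result, because the hidden state accumulates information from earlier words step by step.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=5, hidden_size=4, batch_first=True)

words = torch.randn(1, 3, 5)             # three word vectors, in order
reversed_words = torch.flip(words, [1])  # same words, reversed order

_, h_fwd = rnn(words)
_, h_rev = rnn(reversed_words)

# Same words, different order -> different final hidden state.
print(torch.allclose(h_fwd, h_rev))  # False
```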

Predict Output (intermediate)
Output shape of a simple RNN

Given a simple RNN that processes batches of embedded word vectors, what is the output shape after processing a batch of 2 sequences, each of length 3 with feature size 5?

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=5, hidden_size=4, batch_first=True)
inputs = torch.randn(2, 3, 5)  # batch=2, seq_len=3, input_size=5
output, hn = rnn(inputs)
print(output.shape)
A. torch.Size([2, 3, 4])
B. torch.Size([3, 2, 4])
C. torch.Size([2, 4, 3])
D. torch.Size([4, 3, 2])
💡 Hint

Remember batch_first=True means batch is the first dimension.
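To check your reasoning, here is a sketch (assuming PyTorch) with sizes deliberately different from the problem's, showing what `batch_first=True` means for the input and output layouts:

```python
import torch
import torch.nn as nn

# With batch_first=True, inputs are (batch, seq_len, input_size)
# and outputs are (batch, seq_len, hidden_size).
rnn = nn.RNN(input_size=7, hidden_size=6, batch_first=True)
x = torch.randn(4, 5, 7)  # batch=4, seq_len=5, input_size=7
output, hn = rnn(x)
print(output.shape)  # torch.Size([4, 5, 6]) -> (batch, seq_len, hidden_size)
print(hn.shape)      # torch.Size([1, 4, 6]) -> (num_layers, batch, hidden_size)
```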

Model Choice (advanced)
Which model inherently captures word order in text?

Among these models, which one inherently understands the order of words in a sentence without additional position information?

A. Transformer without positional encoding
B. Bag-of-Words model
C. Multilayer Perceptron (MLP) on word counts
D. Recurrent Neural Network (RNN)
💡 Hint

Think about which model processes words one after another.
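A tiny illustration (plain Python; my own toy example) of why count-based representations cannot capture word order on their own:

```python
from collections import Counter

# Two sentences with opposite meanings but identical word counts.
s1 = "dog bites man".split()
s2 = "man bites dog".split()

# A bag-of-words model sees only the counts, which are identical here,
# so it cannot tell the two sentences apart without order information.
print(Counter(s1) == Counter(s2))  # True
```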

Hyperparameter (advanced)
Effect of sequence length on RNN training

What is a common problem when training RNNs on very long sequences, and which hyperparameter adjustment can help?

A. Exploding gradients; use gradient clipping to limit gradient size.
B. Underfitting; reduce hidden size to simplify the model.
C. Overfitting; increase learning rate to speed training.
D. Data leakage; shuffle sequences randomly before training.
💡 Hint

Think about what happens to gradients when backpropagating through many steps.
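For reference, a minimal sketch (assuming PyTorch; the sizes and loss are illustrative) of gradient clipping inside a training step on a long sequence:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=5, hidden_size=4, batch_first=True)
optimizer = torch.optim.SGD(rnn.parameters(), lr=0.01)

x = torch.randn(2, 50, 5)       # a fairly long sequence
target = torch.randn(2, 50, 4)

output, _ = rnn(x)
loss = nn.functional.mse_loss(output, target)

optimizer.zero_grad()
loss.backward()
# Rescale gradients so their total norm is at most 1.0; this bounds
# exploding gradients. (Vanishing gradients are usually addressed with
# gated architectures such as LSTM/GRU rather than clipping.)
torch.nn.utils.clip_grad_norm_(rnn.parameters(), max_norm=1.0)
optimizer.step()
```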

Metrics (expert)
Interpreting perplexity in language models

A language model has a perplexity of 50 on a test set. What does this number mean?

A. The model predicts the next word correctly 50 times in a row.
B. The model is wrong 50% of the time when predicting the next word.
C. On average, the model is as uncertain as choosing among 50 equally likely next words.
D. The model's loss value is 50 after training.
💡 Hint

Perplexity measures how well a model predicts a sequence; lower is better.
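As a worked check (plain Python; the numbers are illustrative): perplexity is the exponential of the average per-word cross-entropy, so a model that is effectively choosing among 50 equally likely next words has perplexity 50.

```python
import math

# If the model assigns probability 1/50 to each observed next word,
# the average negative log-likelihood per word is ln(50)...
avg_nll = -math.log(1 / 50)

# ...and perplexity = exp(average NLL) recovers 50.
perplexity = math.exp(avg_nll)
print(round(perplexity, 6))  # 50.0
```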