Beam search decoding in NLP - Model Pipeline Trace

Model Pipeline - Beam search decoding

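Beam search decoding is a method language models use to search for a high-probability sequence of words. At each step it explores several candidate continuations and keeps only the best few (the beam width), trading some search quality for speed. It is a heuristic: the result is not guaranteed to be the single most likely sequence.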

Data Flow - 5 Stages
1. Input sequence
   Input: 1 sequence x 1 token
   Operation: Start with initial token (e.g., <start>)
   Output: 1 sequence x 1 token
   Example: "<start>" token
2. Model prediction
   Input: 1 sequence x 1 token
   Operation: Model predicts probabilities for next tokens
   Output: 1 sequence x vocabulary size probabilities
   Example: Probabilities for next words like {"the":0.3, "a":0.2, "cat":0.1, ...}
3. Beam expansion
   Input: beam width sequences x current length tokens
   Operation: Expand each sequence by all possible next tokens, score them
   Output: beam width * vocabulary size sequences x (current length + 1) tokens
   Example: From 3 sequences, each expanded by 5 tokens → 15 sequences
4. Beam pruning
   Input: beam width * vocabulary size sequences x (current length + 1) tokens
   Operation: Keep top beam width sequences with highest scores
   Output: beam width sequences x (current length + 1) tokens
   Example: Keep top 3 sequences out of 15
5. Repeat until end token
   Input: beam width sequences x tokens
   Operation: Repeat expansion and pruning until <end> token or max length
   Output: beam width sequences x final length tokens
   Example: Final 3 sequences like ["<start> the cat <end>", "<start> a dog <end>", ...]
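
The five stages above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `TOY_MODEL`, `next_token_probs`, and `beam_search` are hypothetical names, the probabilities are made up, and the toy model looks only at the last token, whereas a real language model conditions on the whole prefix. Scores are summed log-probabilities, a common convention assumed here.

```python
import math

# Hypothetical toy "model": maps the last token to next-token probabilities.
# The numbers are illustrative only.
TOY_MODEL = {
    "<start>": {"the": 0.5, "a": 0.4, "<end>": 0.1},
    "the": {"cat": 0.6, "dog": 0.3, "<end>": 0.1},
    "a": {"cat": 0.2, "dog": 0.7, "<end>": 0.1},
    "cat": {"<end>": 0.9, "the": 0.1},
    "dog": {"<end>": 0.8, "a": 0.2},
}


def next_token_probs(sequence):
    """Stage 2: return {token: probability} for the next position."""
    return TOY_MODEL[sequence[-1]]


def beam_search(beam_width=3, max_length=6):
    # Stage 1: one sequence holding only the start token, log-probability 0.
    beams = [(0.0, ["<start>"])]
    for _ in range(max_length - 1):
        candidates = []
        for score, seq in beams:
            if seq[-1] == "<end>":
                # Finished sequences are carried forward unchanged.
                candidates.append((score, seq))
                continue
            # Stage 3: expand by every possible next token and score it.
            for token, prob in next_token_probs(seq).items():
                candidates.append((score + math.log(prob), seq + [token]))
        # Stage 4: keep only the beam_width highest-scoring candidates.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
        # Stage 5: stop once every kept sequence has ended.
        if all(seq[-1] == "<end>" for _, seq in beams):
            break
    return beams


results = beam_search(beam_width=3)
for score, seq in results:
    print(f"{math.exp(score):.3f}  {' '.join(seq)}")
```

With these toy numbers the top-ranked result is "<start> the cat <end>" with probability 0.27. Raising `beam_width` keeps more candidates alive per step, at proportionally higher cost.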
Training Trace - Epoch by Epoch
Loss
2.3 |****
1.8 |***
1.4 |**
1.1 |*
0.9 |
Epoch | Loss ↓ | Accuracy ↑ | Observation
1 | 2.3 | 0.25 | High loss and low accuracy as model starts learning
2 | 1.8 | 0.40 | Loss decreases and accuracy improves
3 | 1.4 | 0.55 | Model learns better word predictions
4 | 1.1 | 0.65 | Loss continues to decrease steadily
5 | 0.9 | 0.72 | Model converges with improved accuracy
Prediction Trace - 5 Layers
Layer 1: Initial token input
Layer 2: Model predicts next token probabilities
Layer 3: Beam expansion with beam width=2
Layer 4: Beam pruning
Layer 5: Repeat until <end> token
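
The trace does not say how candidates are scored. A common convention, assumed here rather than taken from the source, is to rank each partial sequence by its summed log-probability. The worked step reuses the example probabilities from stage 2 ("the": 0.3, "a": 0.2, "cat": 0.1) with beam width 2.

```latex
% Score of a partial sequence y_1 ... y_t (assumed convention):
\[
\operatorname{score}(y_{1:t}) = \sum_{i=1}^{t} \log p\left(y_i \mid y_{<i}\right)
\]
% Worked first step, natural logarithms:
\[
\log 0.3 \approx -1.20, \quad \log 0.2 \approx -1.61, \quad \log 0.1 \approx -2.30
\]
% Pruning with beam width 2 keeps "the" and "a" and drops "cat".
```

Summing logs rather than multiplying probabilities gives the same ranking while avoiding numerical underflow on long sequences.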
Model Quiz - 3 Questions
Test your understanding
What does beam width control in beam search decoding?
A. Number of sequences kept at each step
B. Length of the output sequence
C. Size of the vocabulary
D. Number of training epochs
Key Insight
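Beam search balances exploring multiple candidate sequences against focusing on the best ones. It usually finds higher-probability outputs than greedy search, which keeps only a single sequence, while its cost grows with the beam width rather than with the number of all possible sequences.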
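
The comparison with greedy search can be made concrete with a small sketch. The table `PROBS` and the function `decode` are hypothetical, and the probabilities are chosen so that the locally best first word leads to a weaker overall sequence. Greedy decoding is treated here as beam search with a beam width of 1.

```python
import math

# Hypothetical next-token table keyed by the full prefix (illustrative numbers).
PROBS = {
    ("<start>",): {"the": 0.6, "a": 0.4},
    ("<start>", "the"): {"cat": 0.4, "dog": 0.3, "bird": 0.3},
    ("<start>", "a"): {"dog": 0.9, "cat": 0.1},
}


def decode(beam_width):
    """Beam search over PROBS; beam_width=1 behaves like greedy decoding."""
    beams = [(0.0, ("<start>",))]
    # All prefixes in the beam have the same length, so checking one suffices.
    while beams[0][1] in PROBS:
        candidates = [
            (score + math.log(p), prefix + (token,))
            for score, prefix in beams
            for token, p in PROBS[prefix].items()
        ]
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    best_score, best_seq = beams[0]
    return best_seq, math.exp(best_score)


greedy_seq, greedy_p = decode(beam_width=1)
beam_seq, beam_p = decode(beam_width=2)
print(f"greedy: {' '.join(greedy_seq)}  p={greedy_p:.2f}")  # <start> the cat, p=0.24
print(f"beam=2: {' '.join(beam_seq)}  p={beam_p:.2f}")      # <start> a dog, p=0.36
```

Greedy commits to "the" because 0.6 > 0.4 and can never recover, while a beam of 2 keeps "a" alive long enough to find the higher-probability sequence. A wider beam is still not guaranteed to find the global optimum in general.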