
Context window and token limits in Prompt Engineering / GenAI - Model Pipeline Trace

Model Pipeline - Context window and token limits

This pipeline traces how input text is processed by a generative AI model whose context window is bounded by a token limit: how the model ingests input tokens, processes them through its layers, and generates output within that budget.

Data Flow - 5 Stages

Stage 1: Input Text
  Input:   1 sample x variable-length text
  Step:    Raw text input from the user
  Output:  1 sample x variable-length text
  Example: "Hello, how are you doing today?"

Stage 2: Tokenization
  Input:   1 sample x variable-length text
  Step:    Convert text into tokens (words or subwords)
  Output:  1 sample x 8 tokens
  Example: ["Hello", ",", "how", "are", "you", "doing", "today", "?"]

Stage 3: Context Window Enforcement
  Input:   1 sample x 8 tokens
  Step:    Keep only the most recent tokens that fit the context window (here, 6 tokens), then pad back to the fixed sequence length of 8
  Output:  1 sample x 8 tokens
  Example: ["how", "are", "you", "doing", "today", "?", "<pad>", "<pad>"]

Stage 4: Model Processing
  Input:   1 sample x 8 tokens
  Step:    Process tokens through transformer layers
  Output:  1 sample x 8 tokens x 768 features
  Example: Tensor of shape (1, 8, 768) holding contextual token embeddings

Stage 5: Output Generation
  Input:   1 sample x 8 tokens x 768 features
  Step:    Generate new tokens autoregressively, up to the output token limit
  Output:  1 sample x 5 tokens
  Example: ["I", "am", "fine", ".", "<eos>"]
Training Trace - Epoch by Epoch

Loss:
2.3 |**************
1.8 |**********
1.2 |*******
0.8 |****
0.5 |**

Epochs -> 1 3 5 7 10
Epoch | Loss ↓ | Accuracy ↑ | Observation
    1 |  2.3   |   0.15     | Model starts with high loss and low accuracy on token prediction
    3 |  1.8   |   0.35     | Loss decreases as the model learns token patterns
    5 |  1.2   |   0.55     | Accuracy improves steadily with training
    7 |  0.8   |   0.70     | Model converges toward lower loss and higher accuracy
   10 |  0.5   |   0.85     | Final epoch shows strong token-prediction performance
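Loss values like those in the trace can be read as a perplexity via exp(loss), a standard way to interpret natural-log cross-entropy on next-token prediction:

```python
import math

def perplexity(cross_entropy_loss):
    # Perplexity = e^loss for cross-entropy measured in nats
    return math.exp(cross_entropy_loss)

print(perplexity(2.3))  # epoch 1: the model is roughly as uncertain as a ~10-way guess
print(perplexity(0.5))  # epoch 10: uncertainty has dropped sharply
```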
Prediction Trace - 4 Layers
Layer 1: Tokenization
Layer 2: Context Window Enforcement
Layer 3: Transformer Layers
Layer 4: Output Generation
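Layers 3 and 4 amount to an autoregressive loop: each generated token is appended to the context, and the window is re-enforced before every step. The scripted `next_token` stand-in below is an assumption for the demo, not a real transformer forward pass:

```python
MAX_CONTEXT = 8    # assumed context window size
EOS = "<eos>"

def generate(prompt_tokens, model_step, max_new_tokens=5, max_context=MAX_CONTEXT):
    # Autoregressive generation bounded by both the context window and an output limit
    context = list(prompt_tokens)
    output = []
    for _ in range(max_new_tokens):
        context = context[-max_context:]   # re-enforce the window at every step
        tok = model_step(context)
        output.append(tok)
        context.append(tok)
        if tok == EOS:                     # stop early on end-of-sequence
            break
    return output

# Scripted stand-in "model" that always continues with the demo reply
reply = iter(["I", "am", "fine", ".", EOS])
result = generate(["how", "are", "you", "doing", "today", "?"], lambda ctx: next(reply))
# result -> ["I", "am", "fine", ".", "<eos>"]
```

The `max_new_tokens` bound is why APIs expose an output token limit separately from the context window: both constrain generation, from different directions.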
Model Quiz - 3 Questions
Test your understanding
What happens if the input text exceeds the model's context window?
A. The model automatically increases its context window
B. Tokens beyond the limit are dropped or truncated
C. The model ignores the token limit and processes all tokens
D. The model returns an error and stops
Key Insight
The context window and token limits define how much text the model can consider at once. Managing these limits is crucial for efficient processing and accurate predictions in generative AI models.
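In practice, managing these limits means budgeting: the prompt and the requested output must share the window. A minimal sketch, where the 4096-token window and the 4-characters-per-token heuristic are rough assumptions (use the model's real tokenizer for accurate counts):

```python
CONTEXT_WINDOW = 4096   # assumed model context window, in tokens
CHARS_PER_TOKEN = 4     # rough heuristic for English text

def estimate_tokens(text):
    # Crude length-based estimate; a real tokenizer gives exact counts
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_window(prompt, max_output_tokens):
    # The prompt plus the reserved output budget must fit in the context window
    return estimate_tokens(prompt) + max_output_tokens <= CONTEXT_WINDOW

print(fits_in_window("Hello, how are you doing today?", 256))   # short prompt fits
print(fits_in_window("x" * 20000, 1000))                        # long prompt does not
```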