Challenge - 5 Problems
Context Window Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
🧠 Conceptual
Intermediate · 2:00 remaining
Understanding context window size
If a language model has a context window of 2048 tokens, what happens when you input a text longer than 2048 tokens?
Attempts: 2 left
💡 Hint
Think about how models handle inputs that exceed their maximum token capacity.
✗ Incorrect
Language models with fixed context windows can only consider a limited number of tokens at once. When input exceeds this limit, the model typically uses the most recent tokens within the window, ignoring earlier tokens.
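The truncation behavior described above can be sketched as a simple sliding window over token IDs. This is a minimal illustration, assuming a hypothetical 2048-token limit; real frameworks apply the same idea inside their tokenizer or generation utilities.

```python
MAX_TOKENS = 2048  # hypothetical context window size

def truncate_to_window(token_ids, max_tokens=MAX_TOKENS):
    """Keep only the most recent tokens that fit in the context window."""
    if len(token_ids) <= max_tokens:
        return token_ids
    return token_ids[-max_tokens:]

long_input = list(range(3000))  # stand-in for 3000 token IDs
window = truncate_to_window(long_input)
print(len(window))  # 2048
print(window[0])    # 952 -- the first 952 tokens were dropped
```

Note that everything before the window is simply invisible to the model, which is why early instructions in a long prompt can be silently lost.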
❓ Predict Output
Intermediate · 2:00 remaining
Token count calculation
Given the following Python code using the Hugging Face tokenizer, what is the output of the print statement?
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
text = 'Hello world! This is a test.'
tokens = tokenizer.encode(text)
print(len(tokens))
Attempts: 2 left
💡 Hint
Count how the tokenizer splits the sentence into tokens.
✗ Incorrect
The GPT-2 tokenizer splits 'Hello world! This is a test.' into 8 tokens: ['Hello', 'Ġworld', '!', 'ĠThis', 'Ġis', 'Ġa', 'Ġtest', '.'] (the 'Ġ' prefix marks a leading space, and the final '.' is its own token), so the print statement outputs 8.
❓ Hyperparameter
Advanced · 2:00 remaining
Choosing context window size for training
When training a transformer model, increasing the context window size from 512 to 2048 tokens will most likely:
Attempts: 2 left
💡 Hint
Think about how longer sequences affect computation in transformers.
✗ Incorrect
Longer context windows mean the model processes more tokens at once, increasing memory and computation time, which slows training and requires more resources.
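Because self-attention compares every token with every other token, its compute and memory scale roughly with the square of the sequence length. A quick back-of-the-envelope sketch (a simplification that ignores the linear-in-length feed-forward cost):

```python
def relative_attention_cost(old_len, new_len):
    """Approximate cost ratio of self-attention when the sequence length changes.

    Attention builds an (n x n) score matrix, so cost grows ~ n^2.
    """
    return (new_len / old_len) ** 2

# Going from 512 to 2048 tokens multiplies attention cost by ~16x.
print(relative_attention_cost(512, 2048))  # 16.0
```

This is why a 4x longer context window costs far more than 4x in training time and memory for a vanilla transformer.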
❓ Metrics
Advanced · 2:00 remaining
Effect of token limits on model evaluation
If a model's context window is 1024 tokens but the evaluation dataset contains samples of 1500 tokens, what is the likely effect on the evaluation metrics?
Attempts: 2 left
💡 Hint
Consider how truncating input affects model understanding.
✗ Incorrect
When inputs exceed the context window, the model only sees part of the input, which can reduce its ability to make accurate predictions, hurting evaluation metrics.
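A minimal sketch of what that truncation looks like at evaluation time, assuming a hypothetical 1024-token window and samples already tokenized to ID lists:

```python
def prepare_eval_batch(samples, window=1024):
    """Truncate each tokenized sample to the model's context window.

    Any answer-bearing tokens past the cutoff are never seen by the model,
    which is what degrades the evaluation metrics.
    """
    return [sample[:window] for sample in samples]

samples = [list(range(1500)), list(range(800))]
truncated = prepare_eval_batch(samples)
print([len(s) for s in truncated])  # [1024, 800]
```

The 1500-token sample loses its last 476 tokens, so any labels or evidence located there cannot influence the model's prediction.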
🔧 Debug
Expert · 3:00 remaining
Diagnosing token limit errors in generation
You use a language model with a 2048 token limit. Your code generates text by appending new tokens to the input prompt repeatedly. After some iterations, generation fails with a token limit error. What is the best way to fix this?
Attempts: 2 left
💡 Hint
Think about how to keep the input size manageable during generation.
✗ Incorrect
Since the model cannot process more than 2048 tokens, removing the oldest tokens keeps the input within limits while preserving recent context for generation.
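The fix above can be sketched as a generation loop that windows its own input. This is an illustration only: `fake_model_next_token` is a hypothetical stand-in for a real model call, and the 2048-token limit is the one assumed by the question.

```python
CONTEXT_LIMIT = 2048  # assumed model token limit

def fake_model_next_token(token_ids):
    """Hypothetical stand-in for a real LM call; returns a dummy token ID."""
    return len(token_ids) % 100

def generate(prompt_ids, n_new_tokens):
    tokens = list(prompt_ids)
    for _ in range(n_new_tokens):
        # Drop the oldest tokens so the model input never exceeds the limit,
        # while the most recent context is preserved.
        context = tokens[-CONTEXT_LIMIT:]
        tokens.append(fake_model_next_token(context))
    return tokens

out = generate(list(range(2000)), 200)
print(len(out))  # 2200 -- generation no longer fails at the token limit
```

Only the model input is windowed; the full generated sequence is still accumulated in `tokens`, so no output text is lost.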