Context window and token limits in Prompt Engineering / GenAI - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is a context window in language models?
A context window is the amount of text (tokens) a language model can look at or remember at one time to understand and generate responses.
beginner
Why do language models have token limits?
Token limits exist because models can only process a fixed number of tokens at once due to memory and computation limits.
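The computation side can be made concrete with a little arithmetic. The sketch below assumes a Transformer with standard full self-attention, which this card does not specify: every token is compared with every other token, so the number of attention scores per head and layer grows with the square of the input length. The function name is made up for illustration.

```python
def attention_score_count(num_tokens: int) -> int:
    """Pairwise attention scores for one head in one layer (full self-attention)."""
    return num_tokens * num_tokens


# Doubling the input length quadruples the number of scores.
for n in (1_000, 2_000, 4_000):
    print(n, attention_score_count(n))
```

Under that assumption, a window twice as long costs about four times as much for this part of the work, which is one reason limits are fixed per model.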
intermediate
How does exceeding the token limit affect a model's output?
If the input is longer than the token limit, the extra tokens are not processed: depending on the API or library, the input is truncated or the request is rejected with an error. Truncation leads to incomplete or less accurate responses.
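The cut-off case can be sketched in a few lines. Whitespace splitting stands in for a real tokenizer here, and `truncate_to_limit` is a hypothetical helper, not a library function.

```python
def truncate_to_limit(tokens, token_limit):
    """Split tokens into the part that fits and the part that is dropped."""
    kept = tokens[:token_limit]
    dropped = tokens[token_limit:]
    return kept, dropped


# Whitespace splitting is an assumption for illustration, not a real tokenizer.
tokens = "Summarize the report and list the three main risks".split()
kept, dropped = truncate_to_limit(tokens, token_limit=6)
print(kept)
print(dropped)
```

In this toy run the end of the instruction falls outside the limit, so a model that only sees `kept` never learns what it was asked to list.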
beginner
What is a token in the context of language models?
A token is a piece of text like a word or part of a word that the model processes. For example, 'chat' and 'ting' might be two tokens for 'chatting'.
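To see how a word can break into pieces, here is a toy greedy splitter. The vocabulary and the function are invented for this example; real tokenizers learn much larger vocabularies from data and may split 'chatting' differently.

```python
# Hypothetical toy vocabulary, chosen only to reproduce the 'chatting' example.
TOY_VOCAB = {"chat", "ting", "s"}


def toy_tokenize(word, vocab=TOY_VOCAB):
    """Greedy longest-match split; falls back to single characters."""
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # No vocabulary entry matched: emit one character.
            pieces.append(word[start])
            start += 1
    return pieces


print(toy_tokenize("chatting"))
print(toy_tokenize("chats"))
```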
intermediate
How can you manage long texts with token limits in language models?
You can split long texts into smaller parts within the token limit or summarize parts to fit the model's context window.
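A minimal sketch of the splitting strategy, assuming the text has already been turned into a list of tokens. `chunk_tokens` is a hypothetical helper, and the integers stand in for token IDs from a real tokenizer.

```python
def chunk_tokens(tokens, max_tokens):
    """Yield consecutive slices of at most max_tokens tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    for start in range(0, len(tokens), max_tokens):
        yield tokens[start:start + max_tokens]


tokens = list(range(10))  # stand-in for token IDs
chunks = list(chunk_tokens(tokens, max_tokens=4))
print(chunks)
```

Fixed-size slices can cut a sentence in half; splitting on paragraph or sentence boundaries first is a common refinement.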
What happens if a text input exceeds a model's token limit?
A. The model ignores tokens beyond the limit
B. The model processes all tokens anyway
C. The model increases its token limit automatically
D. The model crashes immediately
Which of these best describes a token?
A. A single character only
B. A sentence
C. An entire paragraph
D. A piece of text like a word or part of a word
Why is the context window important for language models?
A. It controls the model's training speed
B. It limits how much text the model can understand at once
C. It decides the model's output language
D. It stores the model's parameters
How can you handle a text longer than the token limit?
A. Split the text into smaller parts
B. Ignore the token limit
C. Add random tokens to the text
D. Use only the first character of the text
What is a common reason for token limits in models?
A. To make models slower
B. To reduce model accuracy
C. Memory and computation constraints
D. To limit user input length arbitrarily
Explain what a context window is and why token limits matter in language models.
Think about how much text the model can see at once and why it can't see unlimited text.
Describe strategies to work with texts longer than a model's token limit.
Consider how to prepare text so the model can handle it properly.

      Practice

      1. What does the context window in a language model refer to?
      easy
      A. The speed at which the model generates text
      B. The maximum amount of text the model can process at once
      C. The number of layers in the model
      D. The size of the model's vocabulary

      Solution

      1. Step 1: Understand the term 'context window'

        The context window is the chunk of text the model reads at one time.
      2. Step 2: Relate to model processing limits

        The model cannot process more text than this window size at once.
      3. Final Answer:

        The maximum amount of text the model can process at once -> Option B
      4. Quick Check:

        Context window = max text processed [OK]
      Hint: Context window means max text input size [OK]
      Common Mistakes:
      • Confusing context window with model layers
      • Thinking it relates to speed
      • Mixing it with vocabulary size
      2. Which of the following is the correct way to check if input text fits within a model's token limit in Python?
      easy
      A. if len(tokenizer.encode(text)) <= token_limit:
      B. if len(text) <= token_limit:
      C. if len(text.split()) <= token_limit:
      D. if text.length <= token_limit:

      Solution

      1. Step 1: Understand token counting

        Tokens are pieces of text, not just characters or words, so we must use the tokenizer.
      2. Step 2: Use tokenizer to encode text

        Using tokenizer.encode(text) gives the token list; its length is the token count.
      3. Final Answer:

        if len(tokenizer.encode(text)) <= token_limit: -> Option A
      4. Quick Check:

        Use tokenizer.encode() to count tokens [OK]
      Hint: Use tokenizer.encode() to count tokens, not len(text) [OK]
      Common Mistakes:
      • Counting characters instead of tokens
      • Counting words by splitting text
      • Using incorrect syntax like text.length
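The three counts from the options can be compared side by side. `toy_encode` below is a stand-in that splits on words and punctuation; it is an assumption for illustration and does not match any real tokenizer's output.

```python
import re


def toy_encode(text):
    """Split into word and punctuation pieces; a stand-in, not a real tokenizer."""
    return re.findall(r"\w+|[^\w\s]", text)


text = "Hello world! This is AI."
print(len(text))              # characters (option B)
print(len(text.split()))      # whitespace-separated words (option C)
print(len(toy_encode(text)))  # toy tokens (the idea behind option A)
```

The three numbers differ even on this short sentence, which is why the limit check has to use the same tokenizer as the model.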
      3. Given a model with a token limit of 10, what will be the output of this Python code snippet?
      text = "Hello world! This is AI."
      tokens = tokenizer.encode(text)
      print(len(tokens) <= 10)
      medium
      A. Error: tokenizer not defined
      B. False
      C. True
      D. 10

      Solution

      1. Step 1: Check for defined variables

        The code uses tokenizer.encode(text), but tokenizer is not defined or imported.
      2. Step 2: Trace execution

        Execution stops at tokens = tokenizer.encode(text) with NameError: name 'tokenizer' is not defined. No output is printed.
      3. Final Answer:

        Error: tokenizer not defined -> Option A
      4. Quick Check:

        Undefined tokenizer causes NameError [OK]
      Hint: Check for undefined variables like tokenizer [OK]
      Common Mistakes:
      • Assuming tokens equal words
      • Ignoring tokenizer definition
      • Confusing output with token count
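The failure can be reproduced, and then avoided, in a short sketch. `WhitespaceTokenizer` is a hypothetical stand-in so the example runs without a tokenizer library; a real tokenizer would give different counts.

```python
text = "Hello world! This is AI."

try:
    tokens = tokenizer.encode(text)  # tokenizer has not been defined yet
    outcome = str(len(tokens) <= 10)
except NameError as exc:
    outcome = f"NameError: {exc}"

print(outcome)


class WhitespaceTokenizer:
    """Hypothetical stand-in tokenizer that splits on whitespace."""

    def encode(self, text):
        return text.split()


# Once a tokenizer object exists, the same check runs.
tokenizer = WhitespaceTokenizer()
fits = len(tokenizer.encode(text)) <= 10
print(fits)
```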
      4. You have a model with a 50-token limit. This code throws an error. What is the likely cause?
      input_text = "A very long text..."  # over 100 tokens
      tokens = tokenizer.encode(input_text)
      if len(tokens) > 50:
          model.generate(tokens)
      medium
      A. The input tokens exceed the model's token limit
      B. The tokenizer.encode() function is missing parentheses
      C. The if condition should be len(tokens) < 50
      D. The model.generate() function cannot accept tokens directly

      Solution

      1. Step 1: Trace code execution flow

        Input exceeds 100 tokens, so len(tokens) > 50 is True and model.generate(tokens) executes.
      2. Step 2: Check model.generate() input type

        In libraries such as Hugging Face Transformers, model.generate() typically expects input_ids as a batched tensor, not the plain Python list that encode() returns by default, so the call raises an error.
      3. Final Answer:

        The model.generate() function cannot accept tokens directly -> Option D
      4. Quick Check:

        model.generate() needs tensor input_ids, not list [OK]
      Hint: model.generate() expects tensor input_ids, not a raw token list [OK]
      Common Mistakes:
      • Assuming generate accepts tokens directly
      • Ignoring correct token limit check
      • Misreading if condition logic
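Separately from the input type, the condition in the snippet calls the model only when the input is over the limit. A sketch of the opposite guard follows; `generate_if_fits` and `fake_generate` are hypothetical names, and the callable stands in for a real model call whose input format depends on the library.

```python
def generate_if_fits(tokens, token_limit, generate_fn):
    """Call generate_fn only when the input fits; otherwise raise a clear error."""
    if len(tokens) > token_limit:
        raise ValueError(
            f"input has {len(tokens)} tokens, limit is {token_limit}"
        )
    return generate_fn(tokens)


# Hypothetical stand-in for a model call: it just reports the input size.
def fake_generate(tokens):
    return f"generated from {len(tokens)} tokens"


print(generate_if_fits(list(range(40)), 50, fake_generate))
```

Raising early gives the caller a chance to truncate, chunk, or summarize instead of sending input the model cannot handle.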
      5. You want to send a long document to a language model with a 1000-token limit. Which approach best ensures the model processes the entire document without errors?
      hard
      A. Only send the first 100 tokens to reduce load
      B. Send the whole document at once and hope the model truncates it correctly
      C. Split the document into chunks of 1000 tokens or less and process each separately
      D. Increase the model's token limit by changing its architecture

      Solution

      1. Step 1: Understand token limit constraints

        The model cannot process more than 1000 tokens at once, so input must fit this limit.
      2. Step 2: Choose a method to handle long text

        Splitting the document into chunks under 1000 tokens ensures all parts are processed without errors.
      3. Step 3: Evaluate other options

        Sending all at once risks truncation; sending only 100 tokens loses data; changing architecture is not feasible.
      4. Final Answer:

        Split the document into chunks of 1000 tokens or less and process each separately -> Option C
      5. Quick Check:

        Chunking long text fits token limits [OK]
      Hint: Split long text into token-sized chunks [OK]
      Common Mistakes:
      • Sending too long text at once
      • Ignoring most of the document
      • Thinking token limit can be changed easily
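Processing each chunk separately can be sketched as a loop over slices. Everything named here is hypothetical: `process_chunk` stands in for a model call, and `reserved` is an assumed allowance for tokens used by instructions and the reply, which the question itself does not mention.

```python
def process_document(tokens, token_limit, process_chunk, reserved=0):
    """Run process_chunk on consecutive chunks that fit within the limit."""
    chunk_size = token_limit - reserved
    if chunk_size <= 0:
        raise ValueError("reserved must be smaller than token_limit")
    results = []
    for start in range(0, len(tokens), chunk_size):
        chunk = tokens[start:start + chunk_size]
        results.append(process_chunk(chunk))
    return results


# Stand-ins: 2,500 fake token IDs and a callable that reports each chunk's size.
document_tokens = list(range(2500))
sizes = process_document(document_tokens, token_limit=1000, process_chunk=len)
print(sizes)
```

With a nonzero `reserved`, the chunks shrink so that each request leaves room for the prompt and the response.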