When using a language model, why is it important to count tokens instead of characters or words?
Think about how language models break down text internally.
Language models split text into tokens, which can be whole words or parts of words. Counting tokens matches how the model processes input and output, which is essential for accurate cost and usage tracking.
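To make the difference concrete, here is a minimal sketch that compares character, word, and token counts for one string. The function toy_tokenize is a hypothetical stand-in, not a real model tokenizer: it simply treats each word and each punctuation mark as one token, whereas real models use learned subword vocabularies and will produce different counts.

```python
import re

def toy_tokenize(text):
    # Hypothetical stand-in for a real tokenizer (an assumption for illustration):
    # each run of word characters and each punctuation mark becomes one token.
    return re.findall(r"\w+|[^\w\s]", text)

text = "Hello world! This is a test."

char_count = len(text)                 # counts every character, including spaces
word_count = len(text.split())         # counts whitespace-separated words
token_count = len(toy_tokenize(text))  # counts toy tokens (words + punctuation)

print(char_count, word_count, token_count)
```

The three numbers differ for the same text, which is why usage and cost are best measured in the unit the model itself uses.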
Given the following Python code that simulates token counting for a model response, what is the printed output?
def count_tokens(text):
    # Simple token count: split by spaces
    return len(text.split())

input_text = "Hello world! This is a test."
output_text = "Hello! This test is simple."
input_tokens = count_tokens(input_text)
output_tokens = count_tokens(output_text)
total_tokens = input_tokens + output_tokens
print(total_tokens)
Count the words in both input and output texts separately, then add.
The input text has 6 words and the output text has 5 words, so the function counts 6 + 5 = 11 "tokens" and the program prints 11.
You have a language model with a maximum token limit of 4096 tokens per request. You want to maximize the amount of information processed while minimizing cost. Which strategy best balances token usage and cost?
Consider both model constraints and cost when setting token limits.
Setting the token limit slightly below the 4096-token maximum leaves room for input and output tokens to fit together in one request. This avoids limit errors while keeping usage, and therefore cost, under control.
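One way to picture this strategy is a small budgeting helper. This is only a sketch: it assumes the 4096-token limit covers prompt and response combined, and both the function name output_budget and the safety_margin value of 64 are hypothetical choices for illustration.

```python
MAX_CONTEXT_TOKENS = 4096  # per-request limit taken from the question

def output_budget(prompt_tokens, safety_margin=64):
    """Return how many output tokens can be requested without exceeding the limit.

    safety_margin is a hypothetical buffer that keeps the request
    slightly below the hard maximum.
    """
    remaining = MAX_CONTEXT_TOKENS - prompt_tokens - safety_margin
    return max(remaining, 0)  # never return a negative budget

budget_small_prompt = output_budget(1200)
budget_large_prompt = output_budget(4090)
print(budget_small_prompt, budget_large_prompt)
```

A prompt that nearly fills the limit leaves no room for a response, so the helper returns 0 rather than a negative number.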
A language model charges $0.0004 per 1,000 tokens. If a user sends a prompt of 1,200 tokens and receives a response of 800 tokens, what is the total cost for this interaction?
Add input and output tokens, then multiply by cost per 1,000 tokens.
Total tokens = 1,200 + 800 = 2,000 tokens. Cost = 2,000 / 1,000 * $0.0004 = $0.0008.
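The same arithmetic can be sketched as a short function. It assumes a single flat rate for input and output tokens, as the question states; the name interaction_cost is hypothetical.

```python
def interaction_cost(input_tokens, output_tokens, price_per_1k=0.0004):
    # Assumes one flat price per 1,000 tokens for both input and output.
    total_tokens = input_tokens + output_tokens
    return total_tokens / 1000 * price_per_1k

cost = interaction_cost(1200, 800)
print(f"${cost:.4f}")
```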
Consider this Python function intended to count tokens by splitting text on spaces. It is used to track token usage for cost calculation. What error or issue will this code cause when processing the text "Hello,world!"?
def count_tokens(text):
    return len(text.split(' '))

print(count_tokens("Hello,world!"))
Try running the split method on a string with punctuation and spaces.
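Using split(' ') splits only on the literal single-space character. "Hello,world!" contains no space at all, so the call returns 1: the comma is not treated as a separator, and the whole string is counted as one token. No exception is raised; the problem is an undercount. The same approach miscounts other inputs too: consecutive spaces produce empty strings that inflate the count, and tabs or newlines are not split at all.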
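The behavior is easy to check by comparing three counting approaches on a few inputs. This is a rough sketch: the helper names are hypothetical, and the regex-based counter is only an approximation that separates words from punctuation, not a real model tokenizer.

```python
import re

def count_single_space(text):
    # Splits only on the literal ' ' character.
    return len(text.split(' '))

def count_whitespace(text):
    # Splits on any run of whitespace (spaces, tabs, newlines).
    return len(text.split())

def count_words_and_punct(text):
    # Rough approximation: words and punctuation marks counted separately.
    return len(re.findall(r"\w+|[^\w\s]", text))

for sample in ["Hello,world!", "Hello  world", "Hello\tworld"]:
    print(repr(sample),
          count_single_space(sample),
          count_whitespace(sample),
          count_words_and_punct(sample))
```

For cost tracking, none of these replaces the model's own tokenizer, but the comparison shows how quickly space-based counting drifts from what is actually in the text.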
