Temperature and sampling help control how creative or random a language model's text predictions are.
Temperature and sampling in NLP
import numpy as np

def sample_with_temperature(logits, temperature=1.0):
    logits = logits / temperature
    exp_logits = np.exp(logits - np.max(logits))
    probs = exp_logits / np.sum(exp_logits)
    return np.random.choice(len(probs), p=probs)
Temperature is a positive number; lower than 1 makes output more focused, higher than 1 makes it more random.
Sampling means picking the next word based on probabilities, not just the highest one.
sample_with_temperature(logits, temperature=0.5)
sample_with_temperature(logits, temperature=1.0)
sample_with_temperature(logits, temperature=2.0)
This code shows how changing temperature affects which index (word) is picked from logits representing word scores.
import numpy as np

def sample_with_temperature(logits, temperature=1.0):
    logits = logits / temperature
    exp_logits = np.exp(logits - np.max(logits))
    probs = exp_logits / np.sum(exp_logits)
    return np.random.choice(len(probs), p=probs)

# Example logits for 5 possible next words
logits = np.array([2.0, 1.0, 0.1, 0.5, 0.0])

print('Sampling with temperature 0.5:')
for _ in range(5):
    idx = sample_with_temperature(logits, temperature=0.5)
    print(f'Chosen index: {idx}')

print('\nSampling with temperature 1.0:')
for _ in range(5):
    idx = sample_with_temperature(logits, temperature=1.0)
    print(f'Chosen index: {idx}')

print('\nSampling with temperature 2.0:')
for _ in range(5):
    idx = sample_with_temperature(logits, temperature=2.0)
    print(f'Chosen index: {idx}')
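Five draws per temperature can look similar by chance, so counting many draws gives a clearer picture. The sketch below is one possible way to do that, reusing the same example logits. The helper name `temperature_probs`, the seed, and the sample size of 10,000 are choices made for this illustration, not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (arbitrary) so the counts are repeatable
logits = np.array([2.0, 1.0, 0.1, 0.5, 0.0])

def temperature_probs(logits, temperature):
    """Hypothetical helper: softmax of logits / temperature."""
    scaled = logits / temperature
    exp_scaled = np.exp(scaled - np.max(scaled))  # subtract max for stability
    return exp_scaled / exp_scaled.sum()

shares = {}
for t in (0.5, 1.0, 2.0):
    probs = temperature_probs(logits, t)
    draws = rng.choice(len(probs), size=10_000, p=probs)
    counts = np.bincount(draws, minlength=len(probs))
    shares[t] = counts / counts.sum()  # fraction of draws per index
    print(f"T={t}: share per index = {np.round(shares[t], 2)}")
```

Index 0 should take the largest share at every temperature, with that share shrinking as the temperature rises.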
Temperature close to zero makes the model almost always pick the highest scoring word.
Sampling with temperature helps avoid boring or repetitive text.
Try different temperatures to find the best balance for your task.
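The notes above say a temperature close to zero almost always picks the highest-scoring word. A temperature of exactly zero cannot be divided by, so one common workaround is to fall back to a plain argmax. The sketch below assumes a hypothetical helper `sample_or_greedy` and an arbitrary cutoff `eps`; neither comes from the original code.

```python
import numpy as np

def sample_or_greedy(logits, temperature=1.0, eps=1e-6, rng=None):
    """Sample an index, or return the argmax when temperature is ~0.

    `sample_or_greedy` and `eps` are hypothetical names for this sketch.
    """
    logits = np.asarray(logits, dtype=float)
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature < eps:
        # Dividing by a zero temperature would yield inf/nan values,
        # so pick the top-scoring index directly.
        return int(np.argmax(logits))
    rng = rng if rng is not None else np.random.default_rng()
    scaled = logits / temperature
    exp_scaled = np.exp(scaled - np.max(scaled))  # subtract max for stability
    probs = exp_scaled / exp_scaled.sum()
    return int(rng.choice(len(probs), p=probs))

logits = [2.0, 1.0, 0.1, 0.5, 0.0]
print(sample_or_greedy(logits, temperature=0.0))  # greedy pick: index 0
print(sample_or_greedy(logits, temperature=1.0))  # random pick, varies per run
```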
Temperature controls randomness in text generation.
Lower temperature = more predictable, higher temperature = more creative.
Sampling picks words based on adjusted probabilities, not just the top choice.
Practice
Solution
Step 1: Understand temperature effect on randomness
Temperature controls how much randomness is added to the word selection process in text generation.
Step 2: Relate temperature to creativity
Higher temperature increases randomness, making the output more creative and less predictable.
Final Answer:
Makes the output more random and creative -> Option C
Quick Check:
Higher temperature = more randomness [OK]
- Thinking higher temperature makes output more predictable
- Confusing temperature with model size
- Assuming temperature stops generation
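One way to put a number on "more random" is the entropy of the next-word distribution: it should rise as the temperature rises. The sketch below is illustrative only. The logits are the example values used earlier on this page, and `softmax_with_temperature` and `entropy_bits` are hypothetical helper names.

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Hypothetical helper: softmax of logits / temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    exp_scaled = np.exp(scaled - np.max(scaled))  # subtract max for stability
    return exp_scaled / exp_scaled.sum()

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability entries are skipped."""
    probs = probs[probs > 0]
    return float(-(probs * np.log2(probs)).sum())

logits = [2.0, 1.0, 0.1, 0.5, 0.0]
results = {}
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    results[t] = (float(probs.max()), entropy_bits(probs))
    print(f"T={t}: top prob={results[t][0]:.2f}, entropy={results[t][1]:.2f} bits")
```

The top probability should fall and the entropy should grow from the first line of output to the last.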
Solution
Step 1: Recall temperature scaling formula
Temperature is applied by dividing logits by temperature before softmax to adjust randomness.
Step 2: Identify correct operation
Dividing logits by temperature scales the logits correctly; multiplying or adding is incorrect.
Final Answer:
probs = softmax(logits / temperature) -> Option A
Quick Check:
Divide logits by temperature before softmax [OK]
- Multiplying logits by temperature instead of dividing
- Adding temperature to logits
- Subtracting temperature from logits
Solution
Step 1: Scale logits by dividing by temperature
Divide each logit by 0.5: [2.0/0.5=4.0, 1.0/0.5=2.0, 0.1/0.5=0.2]
Step 2: Calculate softmax probabilities
Compute exp values: exp(4.0)≈54.60, exp(2.0)≈7.39, exp(0.2)≈1.22; sum≈63.21; probability of the first token = 54.60/63.21 ≈ 0.86
Final Answer:
About 0.86 -> Option D
Quick Check:
Lower temperature sharpens distribution, first token ~0.86 [OK]
- Multiplying logits by temperature instead of dividing
- Skipping exponentiation step
- Using temperature incorrectly in softmax
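The arithmetic in this solution can be checked in a few lines of NumPy. This is only a verification sketch using the logits and temperature from the steps above. Subtracting the maximum before exponentiating is an added numerical-stability step and does not change the result.

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.1])
temperature = 0.5

scaled = logits / temperature               # [4.0, 2.0, 0.2]
exp_scaled = np.exp(scaled - scaled.max())  # stable exponentiation
probs = exp_scaled / exp_scaled.sum()

print(np.round(probs, 2))  # first entry is about 0.86
```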
scaled_logits = logits * temperature
probs = softmax(scaled_logits)
sampled_token = sample_from(probs)
Solution
Step 1: Identify temperature scaling mistake
The code multiplies logits by temperature, which is incorrect; it should divide logits by temperature.
Step 2: Explain effect of wrong scaling
Multiplying by a temperature greater than 1 widens the gaps between logits, so the softmax becomes peakier and less random, and the same token tends to be output repeatedly.
Final Answer:
They should divide logits by temperature, not multiply -> Option A
Quick Check:
Divide logits by temperature for correct scaling [OK]
- Multiplying instead of dividing logits
- Setting temperature to zero
- Ignoring softmax step
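To see why the multiplication bug produces the same token over and over, it can help to compare both versions side by side. The sketch below assumes illustrative logits and a temperature of 2.0 (neither is given in the original snippet) and defines its own small `softmax` helper.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax for a 1-D array."""
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # illustrative values, not from the original
temperature = 2.0                   # assumed "high" temperature

wrong = softmax(logits * temperature)  # the bug: sharpens the distribution
right = softmax(logits / temperature)  # the fix: flattens the distribution

print("multiplied:", np.round(wrong, 2))
print("divided:   ", np.round(right, 2))
```

With the multiplication, the top token should take most of the probability mass. With the division, the mass is spread more evenly across tokens.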
Solution
Step 1: Understand temperature impact on creativity
Temperature ~0.7 balances randomness and predictability, avoiding too repetitive or too random output.
Step 2: Choose sampling method for balance
Top-k sampling limits choices to top probable tokens, improving coherence while allowing creativity.
Final Answer:
Temperature around 0.7 with top-k sampling -> Option B
Quick Check:
Moderate temperature + top-k = balanced creativity [OK]
- Using very low temperature causing boring text
- Using very high temperature causing nonsense
- Ignoring sampling method effects
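Top-k sampling combined with temperature can be sketched as follows. This is a minimal illustration, not a reference implementation. The function name `sample_top_k`, the value `k=3`, and the example logits are assumptions made for this sketch.

```python
import numpy as np

def sample_top_k(logits, k=3, temperature=0.7, rng=None):
    """Hypothetical helper: keep the k highest logits, then sample with temperature."""
    logits = np.asarray(logits, dtype=float)
    rng = rng if rng is not None else np.random.default_rng()
    k = min(k, len(logits))
    top_idx = np.argsort(logits)[-k:]           # indices of the k highest logits
    scaled = logits[top_idx] / temperature      # temperature must be positive
    exp_scaled = np.exp(scaled - scaled.max())  # subtract max for stability
    probs = exp_scaled / exp_scaled.sum()
    # Map the position sampled within the top-k set back to the original index.
    return int(top_idx[rng.choice(k, p=probs)])

logits = [2.0, 1.0, 0.1, 0.5, 0.0]
rng = np.random.default_rng(0)  # arbitrary seed for repeatability
picks = [sample_top_k(logits, k=3, temperature=0.7, rng=rng) for _ in range(10)]
print(picks)  # only indices 0, 1, and 3 can appear
```

Restricting the candidates to the top k removes the unlikely tail entirely, while the temperature still controls how the remaining candidates are weighted.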
