Temperature and sampling in NLP - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Effect of Temperature on Sampling Distribution

In natural language generation, the temperature parameter controls randomness in sampling. What happens to the output distribution when the temperature is set very high (e.g., 10)?

A. The distribution becomes nearly uniform, making all words almost equally likely.
B. The distribution becomes very peaked, favoring the most probable word strongly.
C. The sampling ignores the model probabilities and picks words randomly from the vocabulary.
D. The model always picks the word with the highest probability deterministically.
💡 Hint

Think about how raising temperature affects the sharpness of probabilities.

Predict Output
intermediate
Output of Sampling with Temperature

Given the logits array and temperature, what is the output probability distribution after applying softmax with temperature?

import numpy as np

def softmax_with_temperature(logits, temperature):
    scaled_logits = logits / temperature
    exp_logits = np.exp(scaled_logits - np.max(scaled_logits))
    return exp_logits / exp_logits.sum()

logits = np.array([2.0, 1.0, 0.1])
temperature = 1.0
probs = softmax_with_temperature(logits, temperature)
print(probs.round(3))
A. [0.576 0.298 0.126]
B. [0.500 0.300 0.200]
C. [0.843 0.128 0.029]
D. [0.659 0.242 0.099]
💡 Hint

A temperature of 1.0 leaves the logits unchanged, so the result is the ordinary softmax; lower temperatures would sharpen the distribution and raise the highest probability.

Model Choice
advanced
Choosing Sampling Strategy for Creative Text Generation

You want to generate creative and diverse text using a language model. Which sampling strategy and temperature setting is best?

A. Beam search with beam width 10 and temperature 0.5
B. Top-k sampling with k=5 and temperature 1.0
C. Random sampling with temperature 5.0
D. Greedy sampling with temperature 0.1
💡 Hint

Consider how top-k limits choices and temperature controls randomness.

Hyperparameter
advanced
Impact of Temperature on Model Output Diversity

While sampling from a trained text generation model, you increase the temperature from 0.7 to 1.5 and observe a change in output diversity. What is the expected effect?

A. Output quality improves with fewer grammatical errors.
B. Output becomes more deterministic and repetitive.
C. Output becomes more random and diverse, with less repetition.
D. Output length drastically decreases.
💡 Hint

Think about how temperature affects randomness in sampling.

Metrics
expert
Evaluating Sampling Temperature Effects on Perplexity

You rescale a language model's next-token probabilities with different temperatures and compute perplexity on a fixed test set. Which temperature setting is likely to yield the lowest perplexity?

A. Temperature 0.1
B. Temperature 1.0
C. Temperature 2.0
D. Temperature 5.0
💡 Hint

Lower perplexity means the model predicts test data better; consider how temperature affects prediction confidence.

Practice

1. What does increasing the temperature parameter in text generation usually do?
easy
A. Makes the output more predictable and repetitive
B. Stops the model from generating any text
C. Makes the output more random and creative
D. Always selects the most probable next word

Solution

  1. Step 1: Understand temperature effect on randomness

    Temperature rescales the logits before softmax, which controls how random the word selection is during text generation.
  2. Step 2: Relate temperature to creativity

    Higher temperature increases randomness, making the output more creative and less predictable.
  3. Final Answer:

    Makes the output more random and creative -> Option C
  4. Quick Check:

    Higher temperature = more randomness [OK]
Hint: Higher temperature means more randomness in output [OK]
Common Mistakes:
  • Thinking higher temperature makes output more predictable
  • Confusing temperature with model size
  • Assuming temperature stops generation
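To make the link between temperature and randomness concrete, here is a minimal sketch that measures how flat the distribution gets as temperature rises. The logits are illustrative values rather than output from a real model, and `entropy` is a helper introduced only for this example.

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, shifted by the max for numerical stability."""
    scaled = np.asarray(logits, dtype=float) / temperature
    exp_scaled = np.exp(scaled - np.max(scaled))
    return exp_scaled / exp_scaled.sum()

def entropy(probs):
    """Shannon entropy in nats; higher means a flatter, more random distribution."""
    probs = probs[probs > 0]
    return float(-(probs * np.log(probs)).sum())

logits = np.array([2.0, 1.0, 0.1])  # illustrative values, not from a real model
for t in (0.5, 1.0, 2.0, 10.0):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t:>4}: probs={probs.round(3)}, entropy={entropy(probs):.3f}")
```

The entropy should grow with each temperature step and approach, without reaching, the maximum of ln(3) for three tokens.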
2. Which of the following code snippets correctly applies temperature scaling to logits before sampling in Python?
easy
A. probs = softmax(logits / temperature)
B. probs = softmax(logits * temperature)
C. probs = softmax(logits + temperature)
D. probs = softmax(logits - temperature)

Solution

  1. Step 1: Recall temperature scaling formula

    Temperature is applied by dividing logits by temperature before softmax to adjust randomness.
  2. Step 2: Identify correct operation

    Dividing logits by temperature scales them correctly. Multiplying inverts the intended effect, and adding or subtracting a constant leaves the softmax output unchanged.
  3. Final Answer:

    probs = softmax(logits / temperature) -> Option A
  4. Quick Check:

    Divide logits by temperature before softmax [OK]
Hint: Divide logits by temperature before softmax [OK]
Common Mistakes:
  • Multiplying logits by temperature instead of dividing
  • Adding temperature to logits
  • Subtracting temperature from logits
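The options above leave `softmax` undefined, so the following is one possible self-contained sketch of the full step from logits to a sampled token. The function names `softmax` and `sample_token`, the seed, and the logits are assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    x = np.asarray(x, dtype=float)
    exp_x = np.exp(x - np.max(x))
    return exp_x / exp_x.sum()

def sample_token(logits, temperature, rng):
    """Divide logits by temperature, apply softmax, then draw one token index."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    probs = softmax(np.asarray(logits, dtype=float) / temperature)
    return int(rng.choice(len(probs), p=probs))

rng = np.random.default_rng(0)      # fixed seed so runs are repeatable
logits = [2.0, 1.0, 0.1]            # illustrative values
samples = [sample_token(logits, temperature=0.7, rng=rng) for _ in range(1000)]
print(np.bincount(samples, minlength=3) / 1000)  # empirical token frequencies
```

The guard against non-positive temperatures is a design choice in this sketch: a temperature of exactly zero would divide by zero, and implementations that accept it usually treat it as a special case that picks the highest-scoring token.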
3. Given logits = [2.0, 1.0, 0.1] and temperature = 0.5, what is the approximate probability of the first token after applying softmax with temperature scaling?
medium
A. About 0.30
B. About 0.60
C. About 0.50
D. About 0.86

Solution

  1. Step 1: Scale logits by dividing by temperature

    Divide each logit by 0.5: [2.0/0.5=4.0, 1.0/0.5=2.0, 0.1/0.5=0.2]
  2. Step 2: Calculate softmax probabilities

    Compute exp values: exp(4.0)=54.6, exp(2.0)=7.39, exp(0.2)=1.22; sum=63.21; probability of first token = 54.6/63.21 ≈ 0.86
  3. Final Answer:

    About 0.86 -> Option D
  4. Quick Check:

    Lower temperature sharpens distribution, first token ~0.86 [OK]
Hint: Divide logits by temperature, then softmax to find probabilities [OK]
Common Mistakes:
  • Multiplying logits by temperature instead of dividing
  • Skipping exponentiation step
  • Using temperature incorrectly in softmax
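The arithmetic in this solution can be checked with a few lines of NumPy. This is a direct transcription of the two steps, using the logits and temperature from the question.

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.1])
temperature = 0.5

scaled = logits / temperature           # step 1: [4.0, 2.0, 0.2]
exp_scaled = np.exp(scaled)             # roughly [54.6, 7.39, 1.22]
probs = exp_scaled / exp_scaled.sum()   # step 2: normalize to sum to 1
print(probs.round(3))                   # [0.864 0.117 0.019]
```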
4. A developer writes this code to sample a token with temperature 1.5 but always gets the same token. What is the likely bug?
scaled_logits = logits * temperature
probs = softmax(scaled_logits)
sampled_token = sample_from(probs)
medium
A. They should divide logits by temperature, not multiply
B. They forgot to apply softmax
C. Temperature should be zero to get randomness
D. Sampling function is incorrect

Solution

  1. Step 1: Identify temperature scaling mistake

    The code multiplies logits by temperature, which is incorrect; it should divide logits by temperature.
  2. Step 2: Explain effect of wrong scaling

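    Multiplying by a temperature above 1 scales the logits up. With temperature 1.5 this is equivalent to sampling at temperature 1/1.5 ≈ 0.67, so the softmax becomes peakier and less random, and the same token is sampled far more often.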
  3. Final Answer:

    They should divide logits by temperature, not multiply -> Option A
  4. Quick Check:

    Divide logits by temperature for correct scaling [OK]
Hint: Divide, don't multiply logits by temperature [OK]
Common Mistakes:
  • Multiplying instead of dividing logits
  • Setting temperature to zero
  • Ignoring softmax step
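The effect of the bug can be seen by comparing both versions side by side. This is a small sketch with illustrative logits; `softmax` is defined locally because the snippet in the question does not show its implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    x = np.asarray(x, dtype=float)
    exp_x = np.exp(x - np.max(x))
    return exp_x / exp_x.sum()

logits = np.array([2.0, 1.0, 0.1])   # illustrative values
temperature = 1.5

buggy = softmax(logits * temperature)               # multiplies: sharpens the distribution
fixed = softmax(logits / temperature)               # divides: flattens it, as intended
equivalent = softmax(logits / (1.0 / temperature))  # the temperature the buggy code really uses

print("buggy:", buggy.round(3))
print("fixed:", fixed.round(3))
```

The buggy version puts more probability on the top token than the fixed one, and it matches dividing by 1/1.5 exactly. With real model logits, which often have a much wider spread than this toy example, the sharpening can be strong enough that one token dominates almost every draw.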
5. You want to generate text that balances creativity and coherence. Which temperature value and sampling strategy combination is best?
hard
A. Temperature 0.1 with greedy sampling
B. Temperature around 0.7 with top-k sampling
C. Temperature 2.0 with random sampling
D. Temperature 1.5 with no sampling (always pick max)

Solution

  1. Step 1: Understand temperature impact on creativity

    Temperature ~0.7 balances randomness and predictability, avoiding output that is either too repetitive or too random.
  2. Step 2: Choose sampling method for balance

    Top-k sampling limits choices to top probable tokens, improving coherence while allowing creativity.
  3. Final Answer:

    Temperature around 0.7 with top-k sampling -> Option B
  4. Quick Check:

    Moderate temperature + top-k = balanced creativity [OK]
Hint: Use moderate temperature and top-k for balanced text [OK]
Common Mistakes:
  • Using very low temperature causing boring text
  • Using very high temperature causing nonsense
  • Ignoring sampling method effects
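As a rough illustration of how a moderate temperature and top-k fit together, the sketch below keeps only the k highest logits and samples among them after temperature scaling. The function name `top_k_sample`, the logits, and the choice of k=3 are assumptions for this example, not a reference implementation from any particular library.

```python
import numpy as np

def top_k_sample(logits, k, temperature, rng):
    """Sample one token index from the k highest-scoring tokens."""
    logits = np.asarray(logits, dtype=float)
    if not 1 <= k <= logits.size:
        raise ValueError("k must be between 1 and the vocabulary size")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top_idx = np.argpartition(logits, -k)[-k:]    # indices of the k largest logits (unordered)
    scaled = logits[top_idx] / temperature        # temperature applies only to the kept tokens
    exp_scaled = np.exp(scaled - scaled.max())    # shift by the max for numerical stability
    probs = exp_scaled / exp_scaled.sum()
    return int(rng.choice(top_idx, p=probs))

rng = np.random.default_rng(0)                    # fixed seed so runs are repeatable
logits = [2.0, 1.0, 0.1, -1.0, -3.0]              # illustrative 5-token vocabulary
tokens = [top_k_sample(logits, k=3, temperature=0.7, rng=rng) for _ in range(200)]
print(sorted(set(tokens)))                        # only indices from the top 3 can appear
```

Tokens outside the top k get zero probability regardless of temperature, which is what keeps unlikely continuations out while the temperature still allows variation among the plausible ones.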