In natural language generation, the temperature parameter controls randomness in sampling. What happens to the output distribution when the temperature is set very high (e.g., 10)?
Think about how raising temperature affects the sharpness of probabilities.
High temperature flattens the probability distribution, making all options closer to equal chance, increasing randomness.
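To see the flattening directly, here is a minimal numpy sketch (function and variable names are illustrative) comparing the same logits at T = 1 and T = 10:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Divide logits by temperature, then apply a numerically stable softmax
    z = logits / temperature
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
p_normal = softmax(logits, temperature=1.0)   # peaked: ~[0.659, 0.242, 0.099]
p_hot = softmax(logits, temperature=10.0)     # near-uniform: ~[0.366, 0.331, 0.303]
```

At T = 10 the largest probability drops from about 0.66 to about 0.37, close to the uniform 1/3, which is exactly the "flattening" the answer describes.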
Given the logits array and temperature, what is the output probability distribution after applying softmax with temperature?
import numpy as np

def softmax_with_temperature(logits, temperature):
    scaled_logits = logits / temperature
    exp_logits = np.exp(scaled_logits - np.max(scaled_logits))
    return exp_logits / exp_logits.sum()

logits = np.array([2.0, 1.0, 0.1])
temperature = 1.0
probs = softmax_with_temperature(logits, temperature)
print(probs.round(3))  # [0.659 0.242 0.099]
At temperature 1.0, softmax leaves the relative logits unchanged, giving approximately [0.659, 0.242, 0.099]; a lower temperature would sharpen this distribution, increasing the highest probability. The output matches option D.
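To confirm the sharpening effect numerically, the same function can be evaluated at T = 0.5 and T = 1.0 (a self-contained sketch; the rounded values in the comments are from running this code):

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    # Temperature-scaled, numerically stable softmax
    scaled = logits / temperature
    e = np.exp(scaled - np.max(scaled))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
p_cool = softmax_with_temperature(logits, 0.5)  # sharper: ~[0.864, 0.117, 0.019]
p_base = softmax_with_temperature(logits, 1.0)  # baseline: ~[0.659, 0.242, 0.099]
```

Halving the temperature pushes the top token's probability from about 0.66 to about 0.86 while the distribution still sums to 1.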
You want to generate creative and diverse text using a language model. Which sampling strategy and temperature setting is best?
Consider how top-k limits choices and temperature controls randomness.
Top-k sampling with a moderate temperature balances creativity and coherence: it samples only from the k most probable tokens, with temperature adding controlled randomness among them.
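The combination can be sketched in a few lines of numpy (a minimal illustration, not tied to any particular library's API; names are illustrative):

```python
import numpy as np

def top_k_sample(logits, k=2, temperature=0.8, rng=None):
    # Keep only the k largest logits, apply temperature, renormalize, and sample
    rng = rng or np.random.default_rng(0)
    top = np.argsort(logits)[-k:]        # indices of the k most probable tokens
    z = logits[top] / temperature
    p = np.exp(z - z.max())
    p /= p.sum()
    return int(top[rng.choice(k, p=p)])

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.1])
samples = [top_k_sample(logits, k=2, temperature=0.8, rng=rng) for _ in range(200)]
```

With k = 2, the least probable token (index 2) can never be sampled, while temperature controls how often the second-best token (index 1) is chosen over the best (index 0).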
During training a text generation model, you observe that increasing temperature from 0.7 to 1.5 changes output diversity. What is the expected effect?
Think about how temperature affects randomness in sampling.
Raising the temperature from 0.7 to 1.5 increases randomness in sampling, producing more diverse and less repetitive outputs, though typically at some cost to coherence.
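The diversity increase can be quantified as the entropy of the temperature-scaled distribution; this sketch (toy logits, illustrative names) compares T = 0.7 and T = 1.5:

```python
import numpy as np

def entropy_at(logits, temperature):
    # Shannon entropy (in nats) of the temperature-scaled softmax distribution
    z = logits / temperature
    p = np.exp(z - z.max())
    p /= p.sum()
    return float(-(p * np.log(p)).sum())

logits = np.array([3.0, 1.5, 0.5, 0.0])
h_07 = entropy_at(logits, 0.7)  # lower entropy: more deterministic sampling
h_15 = entropy_at(logits, 1.5)  # higher entropy: more diverse sampling
```

For any fixed logits, entropy increases monotonically with temperature, approaching the uniform distribution's entropy as T grows, which is the mechanism behind the increased diversity.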
You generate text samples from a language model at different temperatures and compute perplexity on a fixed test set. Which temperature setting is likely to yield the lowest perplexity?
Lower perplexity means the model predicts test data better; consider how temperature affects prediction confidence.
Lower temperature (T < 1) sharpens the distribution toward the most likely tokens; when those top predictions match the actual test tokens, their assigned probabilities rise and perplexity falls.
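A toy example makes the mechanism concrete. The sketch below assumes the observed test tokens coincide with the model's top prediction (an assumption, not a property of real test sets); under that assumption, cooling the distribution lowers perplexity:

```python
import numpy as np

def temp_probs(logits, temperature):
    # Temperature-scaled softmax
    z = logits / temperature
    e = np.exp(z - z.max())
    return e / e.sum()

def perplexity(token_probs):
    # exp(mean negative log-likelihood) of the observed tokens
    return float(np.exp(-np.mean(np.log(token_probs))))

logits = np.array([2.5, 1.0, 0.2])  # token 0 is the model's top prediction
# Toy "test set": the top token is the one actually observed, 10 times
ppl_cool = perplexity(np.full(10, temp_probs(logits, 0.5)[0]))
ppl_hot = perplexity(np.full(10, temp_probs(logits, 2.0)[0]))
```

Note the caveat: if the test tokens were ones the model ranks low, sharpening would instead inflate perplexity, which is why the hint frames this in terms of prediction confidence.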