
Temperature and sampling in NLP - ML Experiment: Train & Evaluate

Experiment - Temperature and sampling
Problem: You have a text generation model that uses sampling with temperature to create sentences. Currently, the model uses a temperature of 1.0 and produces repetitive or dull text.
Current Metrics: Sampled text shows low diversity and high repetitiveness; qualitative evaluation indicates low creativity.
Issue: The model's output lacks variety and creativity due to suboptimal temperature settings during sampling.
Your Task
Adjust the temperature parameter during sampling to increase the diversity and creativity of generated text without making it nonsensical.
Do not change the underlying language model architecture.
Only modify the temperature parameter and sampling method.
Keep the sampling code runnable and simple.
Solution
import numpy as np

def sample_with_temperature(logits, temperature=1.0):
    # Convert logits to probabilities with temperature
    scaled_logits = logits / temperature
    exp_logits = np.exp(scaled_logits - np.max(scaled_logits))
    probs = exp_logits / np.sum(exp_logits)
    # Sample from the probability distribution
    return np.random.choice(len(probs), p=probs)

# Example logits for a vocabulary of 5 tokens
logits = np.array([2.0, 1.0, 0.1, 0.5, 1.5])

# Sample tokens with different temperatures
for temp in [0.5, 1.0, 1.5]:
    print(f"Sampling with temperature={temp}:")
    samples = [sample_with_temperature(logits, temperature=temp) for _ in range(10)]
    print(samples)
Added a temperature parameter that scales the logits before converting them to probabilities.
Implemented sampling from the adjusted probability distribution.
Demonstrated sampling at temperatures 0.5, 1.0, and 1.5 to show the effect on output diversity.
Results Interpretation

Before: Sampling at temperature 1.0 produced repetitive tokens with low diversity.

After: Sampling at temperature 0.5 reduced randomness, focusing on likely tokens, while temperature 1.5 increased randomness, producing more diverse but sometimes less sensible tokens.

Adjusting temperature during sampling controls randomness: lower temperature makes output more predictable, higher temperature increases creativity but risks nonsense. This helps balance diversity and coherence in text generation.
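To see why this happens, it helps to look at the probability distribution itself rather than at samples. The sketch below (reusing the same softmax-with-temperature computation and example logits from the solution) prints the probability of the most likely token at each temperature: lower temperatures concentrate mass on the top token, higher temperatures flatten the distribution.

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    # Scale logits by temperature, then apply a numerically stable softmax
    scaled = np.asarray(logits, dtype=float) / temperature
    exp = np.exp(scaled - np.max(scaled))
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.1, 0.5, 1.5])
for temp in [0.5, 1.0, 1.5]:
    probs = softmax_with_temperature(logits, temperature=temp)
    # The peak probability shrinks as temperature rises
    print(f"T={temp}: top-token probability = {probs.max():.3f}")
```

The printed peak probability decreases monotonically as temperature increases, which is exactly the "more predictable vs. more diverse" trade-off described above.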
Bonus Experiment
Try implementing top-k sampling combined with temperature to further control output diversity.
💡 Hint
Limit sampling to the top k tokens with highest probabilities after applying temperature scaling, then sample from this smaller set.
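One possible sketch of the bonus experiment, building on the `sample_with_temperature` function above: apply temperature scaling, keep only the k highest-scoring tokens, renormalize their probabilities, and sample from that reduced set. The function name and the choice of k here are illustrative, not part of the original exercise.

```python
import numpy as np

def top_k_sample_with_temperature(logits, k=3, temperature=1.0, rng=None):
    # Scale logits by temperature first
    rng = rng if rng is not None else np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / temperature
    # Keep only the indices of the k highest scaled logits
    top_idx = np.argsort(scaled)[-k:]
    # Softmax over the surviving logits only (numerically stable)
    exp = np.exp(scaled[top_idx] - scaled[top_idx].max())
    probs = exp / exp.sum()
    # Sample a token index from the restricted distribution
    return int(rng.choice(top_idx, p=probs))

logits = np.array([2.0, 1.0, 0.1, 0.5, 1.5])
samples = [top_k_sample_with_temperature(logits, k=3, temperature=0.8)
           for _ in range(10)]
print(samples)  # only tokens 0, 1, and 4 (the three highest logits) can appear
```

With k=3 the two lowest-probability tokens can never be sampled, so even at high temperatures the output stays within the most plausible candidates, combining the diversity of temperature with a hard cap on nonsense.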