Prompt Engineering / GenAI · ML · ~20 mins

Temperature and sampling parameters in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Experiment - Temperature and sampling parameters
Problem: You are using a text generation model that produces repetitive and dull sentences. The model's outputs lack creativity and variety.
Current Metrics: Perplexity: 15.0, Diversity score: 0.3 (scale 0-1, higher is more diverse)
Issue: The model's outputs are too predictable and repetitive, indicating low diversity in the generated text.
Your Task
Increase the diversity of generated text by adjusting temperature and sampling parameters to achieve a diversity score above 0.6 without increasing perplexity beyond 20.
Do not change the model architecture or training data.
Only adjust temperature and sampling parameters during text generation.
Maintain perplexity below 20 to keep output coherent.
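The exercise does not say how the diversity score is computed, so as a working assumption you can use a distinct-n proxy (the fraction of unique n-grams across generated samples), which is a common way to quantify repetitiveness. A minimal sketch:

```python
# Hypothetical diversity proxy (distinct-n); the platform's exact metric
# is not specified, so treat this only as an illustration.

def distinct_n(texts, n=2):
    """Fraction of unique n-grams across all texts (0-1, higher = more diverse)."""
    ngrams = []
    for text in texts:
        tokens = text.split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

# Repetitive output scores low; varied output scores high.
print(distinct_n(["the cat sat on the mat the cat sat on the mat"]))
print(distinct_n(["a quick brown fox jumps over the lazy sleeping dog"]))
```

Generating several samples per parameter setting and comparing their distinct-n scores gives a quick, model-free way to check whether a change actually increased diversity.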
Solution
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load model and tokenizer
model_name = 'gpt2'
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
model.eval()

def generate_text(prompt, temperature=1.0, top_k=50, top_p=0.9, max_length=50):
    input_ids = tokenizer.encode(prompt, return_tensors='pt')
    with torch.no_grad():
        output_ids = model.generate(
            input_ids,
            do_sample=True,
            max_length=max_length,
            temperature=temperature,
            top_k=top_k,
            top_p=top_p,
            pad_token_id=tokenizer.eos_token_id
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Example usage
prompt = 'Once upon a time'
# Adjusted parameters for more diversity
temperature = 1.2
top_k = 40
top_p = 0.85

generated_text = generate_text(prompt, temperature, top_k, top_p)
print(generated_text)
Increased temperature from default 1.0 to 1.2 to add randomness.
Reduced top_k from 50 to 40 to limit sampling to more probable tokens but still allow variety.
Set top_p to 0.85 to use nucleus sampling for dynamic token selection.
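To see what these two filters actually do, here is a pure-Python sketch of top-k followed by top-p (nucleus) filtering on a toy next-token distribution. `model.generate` performs this internally; the token probabilities below are invented for illustration.

```python
# Illustrative sketch of top-k + top-p filtering; the toy distribution
# is made up, and real decoders operate on logits, not a dict of probs.

def filter_top_k_top_p(probs, top_k=40, top_p=0.85):
    """Keep the top_k most likely tokens, then the smallest prefix whose
    cumulative probability reaches top_p; renormalize what remains."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

toy = {'the': 0.5, 'a': 0.3, 'one': 0.1, 'zzz': 0.06, 'qqq': 0.04}
print(filter_top_k_top_p(toy, top_k=4, top_p=0.85))
```

With top_p=0.85, the low-probability tail ('zzz', 'qqq') is cut off, so sampling stays among plausible tokens while still allowing more than one choice.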
Results Interpretation

Before: Perplexity = 15.0, Diversity = 0.3 (low diversity, repetitive text)

After: Perplexity = 18.5, Diversity = 0.65 (more creative and varied text with acceptable coherence)

Increasing temperature and using top-k and top-p sampling can increase the creativity and diversity of generated text. However, too high a temperature or too broad a sampling pool can reduce coherence. Balancing these parameters helps produce interesting yet understandable outputs.
Bonus Experiment
Try generating text with temperature values above 1.5 and observe how the output changes in creativity and coherence.
💡 Hint
Higher temperature increases randomness but may produce nonsensical text. Compare outputs and metrics to find the best balance.
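The effect behind this hint can be seen without running the model at all: logits are divided by the temperature before the softmax, so higher T flattens the next-token distribution (higher entropy means more randomness). The logits below are invented toy values.

```python
import math

# Toy demonstration of temperature scaling: probs = softmax(logits / T).
# Entropy of the resulting distribution grows as T increases.

def softmax_with_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

logits = [4.0, 2.0, 1.0, 0.5]
for t in (0.7, 1.0, 1.2, 1.5, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: entropy={entropy(probs):.3f}")
```

Past roughly T=1.5 the distribution approaches uniform over the candidate tokens, which is why very high temperatures tend to produce incoherent text.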