
Top-p and top-k sampling in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Experiment - Top-p and top-k sampling
Problem: You have a language model that generates text by picking the next word from a probability distribution over the vocabulary. Currently it picks the highest-probability word every time (greedy decoding), which makes the text repetitive and dull.
Current Metrics: Text diversity is low, with many repeated phrases. Perplexity is 15.0 on the validation set.
Issue: The model's output is too predictable and lacks creativity because it always picks the most likely next word.
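For reference, the greedy baseline described above can be sketched in a few lines. This is a toy illustration with made-up logits; the function name `greedy_sampling` is chosen here for illustration:

```python
import torch

def greedy_sampling(logits):
    # Deterministically pick the single highest-scoring token (argmax).
    return torch.argmax(logits).item()

logits = torch.tensor([2.0, 1.0, 0.5])
print(greedy_sampling(logits))  # always 0: same input, same output
```

Because the output is a pure function of the logits, repeated prompts produce identical continuations, which is exactly the diversity problem this experiment addresses.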
Your Task
Improve the text diversity by implementing top-k and top-p sampling methods. Aim to increase diversity while keeping the text coherent.
You must keep the model architecture and weights unchanged.
Only change the sampling method during text generation.
Use k values between 5 and 50 for top-k sampling.
Use p values between 0.7 and 0.95 for top-p sampling.
Solution
import torch
import torch.nn.functional as F

def top_k_sampling(logits, k):
    # logits: tensor of shape (vocab_size,)
    # Keep the k highest-scoring tokens, renormalize, and sample among them.
    values, indices = torch.topk(logits, k)
    probs = F.softmax(values, dim=-1)
    next_word = indices[torch.multinomial(probs, 1)]
    return next_word.item()

def top_p_sampling(logits, p):
    # Sort tokens by score and keep the smallest prefix whose
    # cumulative probability reaches p (nucleus sampling).
    sorted_logits, sorted_indices = torch.sort(logits, descending=True)
    cumulative_probs = torch.cumsum(F.softmax(sorted_logits, dim=-1), dim=-1)
    # First index where the cumulative probability reaches p; +1 keeps that token.
    cutoff = int(torch.searchsorted(cumulative_probs, p))
    filtered_logits = sorted_logits[:cutoff + 1]
    filtered_indices = sorted_indices[:cutoff + 1]
    probs = F.softmax(filtered_logits, dim=-1)
    next_word = filtered_indices[torch.multinomial(probs, 1)]
    return next_word.item()

# Example usage with dummy logits
vocab_size = 10000
logits = torch.randn(vocab_size)  # random scores for each word

k = 20
p = 0.9

next_word_top_k = top_k_sampling(logits, k)
next_word_top_p = top_p_sampling(logits, p)

print(f"Next word index with top-k sampling (k={k}): {next_word_top_k}")
print(f"Next word index with top-p sampling (p={p}): {next_word_top_p}")
Implemented top-k sampling to pick next word from top k probable words.
Implemented top-p sampling to pick next word from smallest set with cumulative probability p.
Replaced greedy sampling with these methods to increase diversity.
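In a generation loop, the replacement amounts to swapping the `argmax` call for one of the sampling functions. A minimal sketch, where `dummy_model` is a hypothetical stand-in (a real LM would map the token sequence so far to next-token logits):

```python
import torch
import torch.nn.functional as F

def top_k_sampling(logits, k):
    # Same top-k sampler as in the solution above.
    values, indices = torch.topk(logits, k)
    probs = F.softmax(values, dim=-1)
    return indices[torch.multinomial(probs, 1)].item()

def dummy_model(tokens, vocab_size=100):
    # Hypothetical stand-in for a language model: returns fake
    # next-token logits for the sequence generated so far.
    torch.manual_seed(len(tokens))  # deterministic per step, for illustration
    return torch.randn(vocab_size)

tokens = [0]  # start token
for _ in range(5):
    logits = dummy_model(tokens)
    tokens.append(top_k_sampling(logits, k=10))  # was: torch.argmax(logits).item()
print(tokens)
```

With greedy decoding the loop would yield the same sequence on every run; with top-k sampling each run can differ while still drawing only from the 10 most probable tokens at each step.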
Results Interpretation

Before: Text was repetitive and predictable with greedy sampling. Perplexity: 15.0.

After: Text is more diverse and interesting using top-k and top-p sampling. Perplexity: 15.2 (similar).

Top-k and top-p sampling balance creativity and coherence: instead of always picking the single most likely word, they sample from a small set of probable words, which adds variety without admitting low-probability nonsense.
Bonus Experiment
Try combining top-k and top-p sampling together to see if it improves text quality further.
💡 Hint
First apply top-k to limit candidates, then apply top-p on that subset before sampling.
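One possible sketch of the combination (an assumption about how to order the filters, not the only valid design): apply top-k first to cap the candidate count, then nucleus filtering on the survivors.

```python
import torch
import torch.nn.functional as F

def top_k_top_p_sampling(logits, k=20, p=0.9):
    # 1) Keep the k highest-scoring tokens; torch.topk returns them
    #    already sorted in descending order.
    values, indices = torch.topk(logits, k)
    # 2) Within those k, keep the smallest prefix whose cumulative
    #    probability reaches p, then sample from the renormalized set.
    cumulative = torch.cumsum(F.softmax(values, dim=-1), dim=-1)
    cutoff = int(torch.searchsorted(cumulative, p)) + 1
    probs = F.softmax(values[:cutoff], dim=-1)
    return indices[torch.multinomial(probs, 1)].item()

logits = torch.randn(10000)
print(top_k_top_p_sampling(logits, k=20, p=0.9))
```

Any token this returns must survive both filters, so it is always among the top-k candidates and within the nucleus of that subset.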