Challenge - 5 Problems
Top-p and Top-k Sampling Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
🧠 Conceptual
Intermediate · 1:30
Understanding Top-k Sampling
In top-k sampling, what does the parameter k control when generating text from a language model?
💡 Hint
Think about how many tokens the model looks at before picking the next word.
✅ Explanation
Top-k sampling limits the candidate tokens to the k most probable ones at each step. This means the model only picks from these top k tokens, ignoring the rest.
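To make this concrete, here is a minimal NumPy sketch (not from the challenge itself) of top-k filtering: every logit outside the k largest is masked to negative infinity before the softmax, so only those k tokens can ever be sampled.

```python
import numpy as np

def top_k_filter(logits, k):
    """Zero out all but the k most probable tokens (illustrative sketch)."""
    indices = np.argsort(logits)[-k:]          # indices of the k largest logits
    masked = np.full_like(logits, -np.inf)     # -inf -> probability 0 after softmax
    masked[indices] = logits[indices]
    exp = np.exp(masked - np.max(masked))      # numerically stable softmax
    return exp / exp.sum()

probs = top_k_filter(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), k=2)
print(np.nonzero(probs)[0])  # only the two largest-logit indices survive
```

Sampling then proceeds over `probs`; all masked tokens have exactly zero probability.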
🧠 Conceptual
Intermediate · 1:30
Understanding Top-p (Nucleus) Sampling
What does the parameter p represent in top-p (nucleus) sampling?
💡 Hint
It relates to the total probability mass of tokens considered.
✅ Explanation
Top-p sampling selects the smallest set of tokens whose cumulative probability exceeds p. This means it dynamically chooses tokens based on their combined probability.
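A short NumPy sketch of the same idea (with made-up probabilities, not from the challenge): sort tokens by probability, keep the smallest prefix whose cumulative mass reaches p, and renormalize.

```python
import numpy as np

def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability reaches p,
    then renormalize. (Illustrative sketch.)"""
    order = np.argsort(probs)[::-1]                 # descending by probability
    cum = np.cumsum(probs[order])
    keep = order[: np.searchsorted(cum, p) + 1]     # smallest prefix with cum >= p
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

probs = np.array([0.5, 0.25, 0.15, 0.07, 0.03])
filtered = top_p_filter(probs, p=0.9)
print(filtered)  # only the first three tokens survive (0.5 + 0.25 + 0.15 = 0.9)
```

Unlike top-k, the number of surviving tokens varies from step to step: a peaked distribution keeps few tokens, a flat one keeps many.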
❓ Predict Output
Advanced · 2:00
Output of Top-k Sampling Code Snippet
What is the output of the following Python code simulating top-k sampling probabilities?
Prompt Engineering / GenAI
```python
import numpy as np

np.random.seed(0)
logits = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
k = 3
# Select top-k
indices = np.argsort(logits)[-k:]
topk_probs = np.exp(logits[indices]) / np.sum(np.exp(logits[indices]))
sampled_index = np.random.choice(indices, p=topk_probs)
print(sampled_index)
```
💡 Hint
Check which indices are top 3 and how probabilities are computed.
✅ Explanation
The top 3 logits are at indices 2, 3, and 4 (values 0.3, 0.4, 0.5), giving softmax probabilities of roughly 0.30, 0.33, and 0.37. With seed 0, NumPy's first uniform draw is about 0.549, which lands in the second bucket of the cumulative distribution (0.30–0.63), so the sampled index is 3.
❓ Metrics
Advanced · 1:30
Effect of Top-p on Diversity Metrics
If you decrease the top-p value from 0.9 to 0.5 during text generation, what is the expected effect on the diversity of generated text?
💡 Hint
Think about how cumulative probability threshold limits token choices.
✅ Explanation
Lowering top-p reduces the set of tokens considered, thus reducing diversity since the model picks from fewer options.
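A tiny numeric sketch (with a hypothetical, made-up next-token distribution) shows how the candidate pool shrinks as p drops:

```python
import numpy as np

# Hypothetical next-token distribution, already sorted descending.
probs = np.array([0.40, 0.25, 0.15, 0.10, 0.06, 0.04])
cum = np.cumsum(probs)

sizes = {}
for p in (0.9, 0.5):
    # Smallest prefix whose cumulative probability reaches p.
    sizes[p] = int(np.searchsorted(cum, p) + 1)

print(sizes)  # the nucleus shrinks as p decreases
```

Here p = 0.9 keeps 4 candidate tokens while p = 0.5 keeps only 2, so the lower threshold yields more repetitive, less diverse text.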
🔧 Debug
Expert · 2:30
Identifying Error in Top-k Sampling Implementation
Consider this code snippet for top-k sampling. Which option correctly identifies the error causing incorrect sampling?
Prompt Engineering / GenAI
```python
import numpy as np

logits = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
k = 2
indices = np.argsort(logits)[:k]
topk_logits = logits[indices]
probs = np.exp(topk_logits) / np.sum(np.exp(topk_logits))
sampled_index = np.random.choice(indices, p=probs)
print(sampled_index)
```
💡 Hint
Check how argsort is used to select top-k logits.
✅ Explanation
Using np.argsort(logits)[:k] selects the smallest k logits, not the largest. It should use [-k:] to get the top k highest logits.
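For reference, a corrected version of the snippet (a seed is added here for reproducibility; it was not part of the original question):

```python
import numpy as np

np.random.seed(0)  # seed added for reproducibility (not in the original snippet)
logits = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
k = 2
indices = np.argsort(logits)[-k:]  # fixed: [-k:] keeps the k LARGEST logits
topk_logits = logits[indices]
probs = np.exp(topk_logits) / np.sum(np.exp(topk_logits))
sampled_index = np.random.choice(indices, p=probs)
print(sampled_index)  # can only ever be 3 or 4 now
```

With the fix, sampling is restricted to indices 3 and 4 (logits 4.0 and 5.0), with index 4 favored at roughly 73% probability after the softmax.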