Recall & Review
beginner
What is top-k sampling in language generation?
Top-k sampling picks the next word from the top k most likely words predicted by the model. It limits choices to a fixed number, making output more focused but still random.
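The idea above can be sketched in a few lines of Python. The distribution and function name here are invented for illustration, not taken from any particular library:

```python
import random

def top_k_sample(probs, k):
    """Sample the next word from the k most likely candidates."""
    # Keep only the k highest-probability words.
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    words = [w for w, _ in top]
    weights = [p for _, p in top]
    # Renormalize the survivors and sample among them.
    total = sum(weights)
    return random.choices(words, weights=[w / total for w in weights])[0]

# Hypothetical next-word distribution for illustration.
probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "xylophone": 0.05}
word = top_k_sample(probs, k=2)  # only "cat" or "dog" can be chosen
```

With k=2, the unlikely words ("fish", "xylophone") can never be sampled, no matter how the randomness falls.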
beginner
Explain top-p (nucleus) sampling in simple terms.
Top-p sampling chooses the smallest set of words whose combined probability is at least p (like 0.9). It adapts the number of choices based on confidence, allowing more variety when uncertain.
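A minimal sketch of the same idea, again with a made-up toy distribution: words are added in order of probability until their cumulative mass reaches p, and sampling happens only inside that "nucleus":

```python
import random

def top_p_sample(probs, p):
    """Sample from the smallest set of words whose probabilities sum to >= p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    candidates, cum = [], 0.0
    for word, prob in ranked:
        candidates.append((word, prob))
        cum += prob
        if cum >= p:  # stop once the nucleus covers probability mass p
            break
    words = [w for w, _ in candidates]
    weights = [pr for _, pr in candidates]
    return random.choices(words, weights=weights)[0]

# Hypothetical next-word distribution for illustration.
probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "xylophone": 0.05}
word = top_p_sample(probs, p=0.9)  # samples from {"cat", "dog", "fish"}
```

With p=0.9, "xylophone" is excluded because the first three words already cover 95% of the probability mass.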
intermediate
How does top-k sampling differ from top-p sampling?
Top-k always picks from a fixed number of words (k), while top-p picks from a variable number of words that together cover a probability threshold (p). Top-p adapts to the model's confidence.
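One way to see the adaptive behavior is to count how many candidate words top-p keeps on a confident versus an uncertain toy distribution (probability values invented for illustration):

```python
def nucleus_size(probs, p):
    """How many words top-p keeps for this distribution."""
    cum, n = 0.0, 0
    for prob in sorted(probs, reverse=True):
        n += 1
        cum += prob
        if cum >= p:
            break
    return n

confident = [0.90, 0.05, 0.03, 0.02]   # model is sure of the next word
uncertain = [0.30, 0.25, 0.25, 0.20]   # model is unsure

# top-k with k=3 would keep exactly 3 candidates in both cases.
# top-p with p=0.9 adapts:
print(nucleus_size(confident, 0.9))  # 1 candidate
print(nucleus_size(uncertain, 0.9))  # 4 candidates
```

When the model is confident, top-p narrows to a single word; when it is uncertain, the nucleus widens, which is exactly the adaptivity top-k lacks.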
beginner
Why do we use sampling methods like top-k or top-p instead of always picking the most likely word?
Always picking the most likely word can make text boring and repetitive. Sampling adds randomness to create more natural, diverse, and interesting outputs.
intermediate
What happens if you set top-p to 1.0 in top-p sampling?
Setting top-p to 1.0 includes the entire vocabulary (100% cumulative probability), so it reduces to plain sampling from the full distribution, which can produce very diverse but less coherent text.
In top-k sampling, what does 'k' represent?
In top-k sampling, 'k' is the fixed number of most likely words considered for sampling.
What does top-p sampling use to decide which words to sample from?
Top-p sampling selects the smallest set of words whose cumulative probability reaches at least the threshold p.
Which sampling method adapts the number of candidate words based on model confidence?
Top-p sampling adapts the candidate set size based on cumulative probability, reflecting model confidence.
Why might always picking the most likely word be a bad idea for text generation?
Always picking the most likely word leads to repetitive and less creative text.
If top-p is set very low (e.g., 0.1), what is likely to happen?
A low top-p threshold means only the highest probability words are included, limiting diversity.
Describe how top-k and top-p sampling work and how they differ.
Think about fixed number vs probability threshold.
Explain why sampling methods like top-k and top-p are important in AI text generation.
Consider what happens if you always pick the top word.