
Top-p and top-k sampling in Prompt Engineering / GenAI - Model Pipeline Trace

Model Pipeline - Top-p and top-k sampling

This pipeline shows how a language model generates text using two popular sampling methods: top-k and top-p (nucleus) sampling. Both narrow the model's next-token choices to the most likely candidates, balancing coherence against diversity in the output.

Data Flow - 6 Stages
1. Input prompt
   User provides a starting text prompt.
   In: 1 sequence of tokens → Out: 1 sequence of tokens
   "The weather today is"

2. Model prediction
   Model predicts probabilities for the next token over the whole vocabulary.
   In: 1 sequence of tokens → Out: 1 probability distribution over 50,000 tokens
   {"the": 0.1, "sunny": 0.05, "rainy": 0.03, "cloudy": 0.02, "windy": 0.01, ...}

3. Top-k filtering
   Keep only the top k=5 tokens with the highest probabilities; set all others to zero and renormalize.
   In: 1 probability distribution over 50,000 tokens → Out: filtered distribution over 5 tokens
   {"the": 0.1, "sunny": 0.05, "rainy": 0.03, "cloudy": 0.02, "windy": 0.01}

4. Top-p (nucleus) filtering
   Keep the smallest set of tokens whose cumulative probability ≥ p=0.9; set all others to zero and renormalize. (In practice top-p is usually an alternative to top-k, applied to the model's full distribution; the two can also be combined, with top-k applied first.)
   In: 1 probability distribution over 50,000 tokens → Out: filtered distribution over ~10 tokens
   {"the": 0.1, "sunny": 0.05, "rainy": 0.03, "cloudy": 0.02, "windy": 0.01, "storm": 0.01, ...}

5. Sampling
   Randomly pick the next token according to the filtered, renormalized probabilities.
   In: filtered probability distribution → Out: 1 chosen token
   "sunny"

6. Output generation
   Append the chosen token to the sequence and repeat the pipeline for the next token.
   In: 1 sequence of tokens + 1 chosen token → Out: longer sequence of tokens
   "The weather today is sunny"
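The filtering stages above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production decoder: the vocabulary and probabilities are toy values, and both filters renormalize so the surviving tokens form a valid distribution.

```python
import numpy as np

def top_k_filter(probs, k=5):
    """Zero out all but the k highest-probability tokens, then renormalize."""
    filtered = np.zeros_like(probs)
    top_idx = np.argsort(probs)[-k:]          # indices of the k largest probabilities
    filtered[top_idx] = probs[top_idx]
    return filtered / filtered.sum()

def top_p_filter(probs, p=0.9):
    """Keep the smallest set of tokens whose cumulative probability >= p."""
    order = np.argsort(probs)[::-1]           # token indices, most probable first
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1  # size of the nucleus
    filtered = np.zeros_like(probs)
    filtered[order[:cutoff]] = probs[order[:cutoff]]
    return filtered / filtered.sum()

# Toy distribution standing in for the model's output over a tiny vocabulary.
vocab = ["the", "sunny", "rainy", "cloudy", "windy", "stormy"]
probs = np.array([0.40, 0.25, 0.15, 0.10, 0.06, 0.04])

rng = np.random.default_rng(0)
filtered = top_p_filter(top_k_filter(probs, k=5), p=0.9)
next_token = vocab[rng.choice(len(vocab), p=filtered)]  # stage 5: sampling
```

Chaining `top_k_filter` then `top_p_filter` mirrors stages 3 and 4; in most real APIs (e.g. Hugging Face `generate` with `top_k`/`top_p`) you would set just one of the two.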
Training Trace - Epoch by Epoch

Epoch: 1 | Loss: 3.2  ***************
Epoch: 2 | Loss: 2.5  ***********
Epoch: 3 | Loss: 2.0  ********
Epoch: 4 | Loss: 1.7  *******
Epoch: 5 | Loss: 1.5  ******
Epoch | Loss ↓ | Accuracy ↑ | Observation
------|--------|------------|--------------------------------------------------
  1   |  3.2   |    0.25    | Model starts learning basic word patterns
  2   |  2.5   |    0.40    | Loss decreases as model predicts common words better
  3   |  2.0   |    0.52    | Model improves on grammar and context
  4   |  1.7   |    0.60    | Model starts capturing longer dependencies
  5   |  1.5   |    0.65    | Training converges with steady improvement
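Assuming the logged loss is mean per-token cross-entropy (the trace does not say), perplexity = exp(loss) turns these numbers into an intuitive "effective branching factor" for the model's next-token guesses:

```python
import math

# Loss values from the training trace above; per-token cross-entropy is assumed.
losses = {1: 3.2, 2: 2.5, 3: 2.0, 4: 1.7, 5: 1.5}
perplexities = {epoch: math.exp(loss) for epoch, loss in losses.items()}
# Epoch 1: exp(3.2) ≈ 24.5 (choosing among ~25 tokens);
# epoch 5: exp(1.5) ≈ 4.5 (choosing among ~4-5 tokens).
```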
Prediction Trace - 5 Layers
Layer 1: Model prediction
Layer 2: Top-k filtering (k=5)
Layer 3: Top-p filtering (p=0.9)
Layer 4: Sampling
Layer 5: Output generation
Model Quiz - 3 Questions
Test your understanding
What does top-k sampling do to the model's predicted probabilities?
A. Randomly shuffles all tokens before sampling
B. Keeps only the top k tokens with the highest probabilities and ignores the rest
C. Keeps tokens until their cumulative probability reaches p
D. Always picks the token with the highest probability
Key Insight
Top-k and top-p sampling help language models generate more natural and diverse text by limiting next-token choices to the most likely candidates. This balances creativity and coherence in the generated text.