Top-k sampling restricts the candidate tokens to the k most probable ones at each step. The probabilities of these k tokens are renormalized and the model samples only from them; every other token is ignored.
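Top-p (nucleus) sampling selects the smallest set of tokens, taken in order of decreasing probability, whose cumulative probability reaches or exceeds p. The size of this set therefore adapts to the distribution: only a few tokens when the model is confident, many when the probability is spread out.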
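The selection rule can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the helper name top_p_filter, the example probabilities, and the seed are all made up for the example, and the threshold is assumed to be inclusive (cumulative probability >= p).

```python
import numpy as np

def top_p_filter(probs, p):
    """Return indices of the smallest set of tokens whose cumulative
    probability reaches p (hypothetical helper, inclusive threshold)."""
    order = np.argsort(probs)[::-1]        # token indices, most probable first
    cumulative = np.cumsum(probs[order])   # running probability mass
    # First position where the running mass reaches p; keep everything up to it.
    cutoff = int(np.searchsorted(cumulative, p)) + 1
    return order[:cutoff]

# Example distribution (assumed values, chosen to be exact in floating point)
probs = np.array([0.5, 0.25, 0.125, 0.0625, 0.0625])

nucleus = top_p_filter(probs, 0.8)
nucleus_probs = probs[nucleus] / probs[nucleus].sum()  # renormalize over the nucleus

rng = np.random.default_rng(0)             # arbitrary seed for reproducibility
token = rng.choice(nucleus, p=nucleus_probs)
print(nucleus, token)
```

Here the running mass is 0.5, 0.75, 0.875, so three tokens (indices 0, 1, 2) are needed to reach p = 0.8, and the sample is drawn from those three only.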
import numpy as np

np.random.seed(0)
logits = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
k = 3

# Select top-k logits
indices = np.argsort(logits)[-k:]
topk_probs = np.exp(logits[indices]) / np.sum(np.exp(logits[indices]))
sampled_index = np.random.choice(indices, p=topk_probs)
print(sampled_index)
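The top 3 logits are at indices 2, 3, and 4 with values 0.3, 0.4, and 0.5. After softmax over these three values, the probabilities are roughly 0.30, 0.33, and 0.37, so index 4 is the most likely choice, but sampling is random and it is not guaranteed to be picked. With np.random.seed(0), the first uniform draw (about 0.549) falls in the cumulative-probability interval belonging to index 3 (about 0.30 to 0.63), so the sampled index is 3.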
Lowering top-p shrinks the set of tokens considered, because less cumulative probability mass has to be covered. With fewer options to sample from, the output becomes less diverse.
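A quick way to see this effect is to count how many tokens survive for several values of p. The sketch below uses an assumed example distribution and a hypothetical helper, nucleus_size, with an inclusive threshold (cumulative probability >= p).

```python
import numpy as np

def nucleus_size(probs, p):
    """Number of tokens kept by top-p filtering (hypothetical helper)."""
    sorted_probs = np.sort(probs)[::-1]    # probabilities, largest first
    cumulative = np.cumsum(sorted_probs)   # running probability mass
    return int(np.searchsorted(cumulative, p)) + 1

# Example distribution (assumed values)
probs = np.array([0.5, 0.25, 0.125, 0.0625, 0.0625])

thresholds = [0.9, 0.8, 0.6, 0.5]
sizes = [nucleus_size(probs, p) for p in thresholds]
for p, size in zip(thresholds, sizes):
    print(f"p={p}: {size} candidate token(s)")
```

For this distribution the candidate set shrinks from 4 tokens at p = 0.9 to a single token at p = 0.5, at which point sampling always returns the most probable token.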
import numpy as np

logits = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
k = 2

indices = np.argsort(logits)[:k]
topk_logits = logits[indices]
probs = np.exp(topk_logits) / np.sum(np.exp(topk_logits))
sampled_index = np.random.choice(indices, p=probs)
print(sampled_index)
np.argsort returns indices in ascending order of value, so np.argsort(logits)[:k] selects the k smallest logits (indices 0 and 1 here), not the largest. It should use np.argsort(logits)[-k:] to get the k highest logits.
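For comparison, a corrected version might look like the sketch below. The seed and the max-subtraction inside the softmax are additions for reproducibility and numerical stability; they are not part of the original snippet.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, added for reproducibility

logits = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
k = 2

# argsort is ascending, so the last k entries are the k largest logits
indices = np.argsort(logits)[-k:]
topk_logits = logits[indices]

# Softmax over the top-k logits; subtracting the max avoids overflow
exp_logits = np.exp(topk_logits - topk_logits.max())
probs = exp_logits / exp_logits.sum()

sampled_index = rng.choice(indices, p=probs)
print(sampled_index)
```

With k = 2 the candidates are indices 3 and 4, so the printed value is always one of those two, with index 4 roughly 2.7 times as likely as index 3.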
