NLP - Sequence Models for NLP

Which of the following correctly represents the formula to compute attention weights using query (Q) and key (K) vectors?

A. Sigmoid(Q - K)
B. Softmax(Q + K)
C. ReLU(Q x K)
D. Softmax(Q x K^T)
Step-by-Step Solution

Step 1: Recall the attention weight calculation. Attention weights are computed by taking the dot product of the query and key vectors, then applying softmax.

Step 2: Match the formula to the options. Softmax(Q x K^T) applies softmax to Q multiplied by the transpose of K, which is correct.

Final Answer: Softmax(Q x K^T) -> Option D

Quick Trick: Attention weights = softmax of the query-key dot product.

Common Mistakes:
- Adding Q and K instead of taking their dot product
- Using ReLU or sigmoid instead of softmax
- Forgetting the transpose on the key matrix
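The formula above can be sketched in a few lines of NumPy. This is a minimal illustration of computing attention weights as softmax(Q x K^T); the function names and toy shapes are our own, and note that real Transformer implementations additionally scale the scores by 1/sqrt(d_k) before the softmax, which this quiz formula omits.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_weights(Q, K):
    # scores[i, j] = dot product of query i with key j -> shape (n_q, n_k)
    scores = Q @ K.T
    # Softmax over the key axis turns scores into weights summing to 1 per query.
    return softmax(scores, axis=-1)

# Toy example: 2 queries, 3 keys, embedding dimension 4.
rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 4))
K = rng.standard_normal((3, 4))
W = attention_weights(Q, K)
print(W.shape)         # (2, 3): one weight per query-key pair
print(W.sum(axis=-1))  # each row sums to 1
```

Each row of W tells you how much attention a given query pays to each key; these weights are then used to take a weighted average of the value vectors.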