NLP - Sequence Models for NLP
Given query vector Q = [1, 0], key vectors K1 = [1, 0] and K2 = [0, 1], and value vectors V1 = [10, 0] and V2 = [0, 20], what is the attention output after applying softmax to the scores Q·K^T and using the resulting weights to form a weighted sum of the values?
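The computation can be checked with a minimal sketch in plain Python. It assumes unscaled dot-product attention, since the question does not mention the 1/sqrt(d_k) factor; the helper names `softmax` and `attention` are illustrative and do not come from any library.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    # Scores are plain dot products Q·K_i (assumption: no 1/sqrt(d_k) scaling).
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    # Output is the weighted sum of the value vectors, one component at a time.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

weights, output = attention([1, 0], [[1, 0], [0, 1]], [[10, 0], [0, 20]])
print(weights)  # approximately [0.731, 0.269]
print(output)   # approximately [7.31, 5.38]
```

The scores are Q·K1 = 1 and Q·K2 = 0, so the softmax weights are e/(e+1) ≈ 0.731 and 1/(e+1) ≈ 0.269, giving an output of roughly [7.31, 5.38]. If the scores were instead divided by sqrt(2), the weights would be closer together and the output would differ.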