The Transformer decoder is often used in tasks like language generation or translation. Here, perplexity is a key metric: it is the exponential of the average negative log-likelihood the model assigns to each actual next token. Lower perplexity means the model assigns higher probability to the observed text, i.e. better predictions.
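A minimal sketch of the computation, assuming we already have the probability the model assigned to each correct next token (the function name and example values are illustrative, not from any particular library):

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-probability
    assigned to the actual next tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Hypothetical per-token probabilities from a model.
probs = [0.5, 0.25, 0.125]
print(perplexity(probs))  # → 4.0 (geometric mean of 1/p values)
```

A perfect model that assigns probability 1.0 to every token would reach the minimum perplexity of 1.0.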
When a Transformer decoder is used for classification-style objectives (next-token prediction is itself a classification over the vocabulary), accuracy, precision, and recall matter depending on the goal. In text generation, for instance, token-level accuracy against the reference is a common diagnostic. For tasks like summarization, metrics such as BLEU or ROUGE are used instead, but those are outside this scope.
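These three metrics can be sketched directly over paired predictions and targets; the token IDs and the choice of positive class below are hypothetical examples, not tied to any specific model:

```python
def accuracy(preds, targets):
    """Fraction of positions where the prediction matches the target."""
    return sum(p == t for p, t in zip(preds, targets)) / len(targets)

def precision_recall(preds, targets, positive):
    """Precision and recall for one designated positive class."""
    tp = sum(p == positive and t == positive for p, t in zip(preds, targets))
    fp = sum(p == positive and t != positive for p, t in zip(preds, targets))
    fn = sum(p != positive and t == positive for p, t in zip(preds, targets))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical predicted vs. reference labels.
preds, targets = [1, 0, 1, 1], [1, 1, 0, 1]
print(accuracy(preds, targets))              # → 0.5
print(precision_recall(preds, targets, 1))   # → (0.666..., 0.666...)
```

In multi-class settings these per-class scores are typically averaged (macro or micro), but the per-class form above is the building block.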