For text tasks such as sentiment analysis or spam detection, accuracy gives the overall share of correct predictions. Because text datasets are often imbalanced, precision and recall matter just as much. Precision is the fraction of predicted positives that are truly positive, so it tracks false alarms. Recall is the fraction of real positives the model finds, so it tracks missed cases. The F1 score is the harmonic mean of precision and recall and summarizes both in a single number.
GRU for text in NLP - Model Metrics & Evaluation
Actual \ Predicted | Positive | Negative
-------------------|----------|---------
Positive | 80 | 20
Negative | 10 | 90
Here, TP=80, FN=20, FP=10, TN=90. Total samples = 200.
Precision = 80 / (80 + 10) = 0.89
Recall = 80 / (80 + 20) = 0.80
F1 Score = 2 * (0.89 * 0.80) / (0.89 + 0.80) ≈ 0.84
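The arithmetic above can be checked with a few lines of plain Python. This is a minimal sketch that uses only the four counts from the confusion matrix; the variable names are chosen here for illustration.

```python
# Counts taken from the confusion matrix above.
tp, fn, fp, tn = 80, 20, 10, 90

total = tp + fn + fp + tn
accuracy = (tp + tn) / total                         # share of all correct predictions
precision = tp / (tp + fp)                           # predicted positives that are real
recall = tp / (tp + fn)                              # real positives that were found
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of the two

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
# accuracy=0.85 precision=0.89 recall=0.80 f1=0.84
```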
Imagine a spam filter using a GRU model:
- High Precision: Few good emails are wrongly marked as spam. Users don't miss important messages.
- High Recall: Most spam emails are caught, but some good emails might be flagged wrongly.
Depending on what matters more (user trust or spam catching), you adjust the model threshold to favor precision or recall.
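To see how the threshold shifts the balance, here is a small sketch in plain Python. The scores and labels are made up for illustration (they are not from a real GRU model), and precision_recall is a hypothetical helper, not a library function.

```python
# Hypothetical spam probabilities from a model, with true labels (1 = spam).
scores = [0.95, 0.90, 0.80, 0.65, 0.55, 0.45, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]


def precision_recall(scores, labels, threshold):
    """Return (precision, recall) when scores >= threshold are flagged as spam."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


for threshold in (0.25, 0.50, 0.75):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

On this toy data, raising the threshold from 0.25 to 0.75 moves precision from about 0.71 up to 1.00 while recall drops from 1.00 to 0.60.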
Good: Accuracy > 85%, Precision and Recall both above 80%, F1 score balanced near 0.8 or higher.
Bad: Accuracy high but recall very low (missing many positives), or precision very low (many false alarms). For example, 90% accuracy but 30% recall means many real positives are missed.
- Accuracy Paradox: High accuracy can be misleading if classes are imbalanced (e.g., 95% accuracy but model ignores rare positive class).
- Data Leakage: If test data leaks into training, metrics look unrealistically good.
- Overfitting Indicators: Very high training accuracy but low test accuracy means model memorizes text instead of learning patterns.
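The accuracy paradox is easy to reproduce. The sketch below assumes a made-up test set with 950 negatives and 50 positives and a "model" that always predicts the negative class; both are hypothetical and only meant to show the effect.

```python
# Hypothetical imbalanced test set: 950 negatives, 50 positives.
labels = [0] * 950 + [1] * 50
# A degenerate "model" that always predicts the negative class.
preds = [0] * len(labels)

accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
# accuracy=0.95 recall=0.00
```

The accuracy looks strong, yet the model never finds a single positive case.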
Your GRU text model has 98% accuracy but only 12% recall on the positive class (e.g., spam). Is it good for production?
Answer: No. Despite high accuracy, the model misses most positive cases. This means many spam emails go undetected, which is bad for user experience. You should improve recall before using it in production.
Practice
Solution
Step 1: Understand GRU's role in memory
GRU units are designed to keep important information from previous steps and forget irrelevant data, helping with sequence tasks like text.
Step 2: Compare options to GRU function
Only "It helps the model remember important information over time while ignoring less important details." correctly describes this memory feature; the other options describe unrelated or incorrect functions.
Final Answer:
It helps the model remember important information over time while ignoring less important details. -> Option A
Quick Check:
GRU memory feature = A [OK]
- Thinking GRU changes input size
- Confusing GRU with data preprocessing
- Assuming GRU outputs images
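The "remember or forget" behaviour comes from two gates. The sketch below is a simplified single GRU step with a scalar input and a scalar hidden state, following the gate equations given in the PyTorch nn.GRU documentation. The weight dictionaries are hypothetical values picked to make the gates easy to see, and biases are left out for brevity.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h_prev, w):
    """One GRU step for a scalar input x and scalar hidden state h_prev.

    `w` holds hypothetical scalar weights; biases are omitted.
    """
    r = sigmoid(w["xr"] * x + w["hr"] * h_prev)          # reset gate
    z = sigmoid(w["xz"] * x + w["hz"] * h_prev)          # update gate
    n = math.tanh(w["xn"] * x + r * (w["hn"] * h_prev))  # candidate state
    return (1.0 - z) * n + z * h_prev                    # blend new and old state


# Update gate near 1: the old hidden state is kept almost unchanged.
keep = {"xr": 0.0, "hr": 0.0, "xz": 10.0, "hz": 0.0, "xn": 1.0, "hn": 1.0}
# Update gate near 0: the old hidden state is replaced by the candidate.
overwrite = dict(keep, xz=-10.0)

h_keep = gru_step(1.0, 0.5, keep)
h_new = gru_step(1.0, 0.5, overwrite)
print(f"keep: {h_keep:.3f}  overwrite: {h_new:.3f}")
```

With the first set of weights the new state stays close to the previous value 0.5; with the second it moves to the candidate value tanh(1.25).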
Solution
Step 1: Recall PyTorch GRU parameters
PyTorch's nn.GRU takes input_size first (the embedding size), then hidden_size (the number of features in the hidden state).
Step 2: Match parameters to given sizes
Embedding size is 100, hidden size is 50, so nn.GRU(input_size=100, hidden_size=50) is correct.
Final Answer:
nn.GRU(input_size=100, hidden_size=50) -> Option C
Quick Check:
input_size=100, hidden_size=50 [OK]
- Swapping input_size and hidden_size
- Using positional args incorrectly
- Omitting required parameters
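To put these two parameters in context, here is a minimal sketch of a text classifier built around nn.GRU(input_size=100, hidden_size=50). The class name GRUClassifier, the vocabulary size of 1000, and the single-logit output are assumptions made for this example, not part of the question.

```python
import torch
import torch.nn as nn


class GRUClassifier(nn.Module):
    """Embedding -> GRU -> linear layer producing one logit per sequence."""

    def __init__(self, vocab_size=1000, embed_dim=100, hidden_size=50):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # input_size must equal the embedding size fed into the GRU.
        self.gru = nn.GRU(input_size=embed_dim, hidden_size=hidden_size,
                          batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)    # (batch, seq_len, embed_dim)
        _, hidden = self.gru(embedded)          # hidden: (1, batch, hidden_size)
        return self.fc(hidden[-1]).squeeze(-1)  # (batch,)


model = GRUClassifier()
token_ids = torch.randint(0, 1000, (4, 12))  # 4 sequences of 12 token ids
logits = model(token_ids)
print(logits.shape)  # torch.Size([4])
```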
import torch
import torch.nn as nn

gru = nn.GRU(input_size=10, hidden_size=20, batch_first=True)
input = torch.randn(5, 7, 10)  # batch=5, seq_len=7, input_size=10
output, hidden = gru(input)
print(output.shape)
Solution
Step 1: Understand GRU output shape with batch_first=True
Output shape is (batch_size, sequence_length, hidden_size) when batch_first=True.
Step 2: Match given input sizes
Input batch=5, seq_len=7, hidden_size=20, so output shape is (5, 7, 20).
Final Answer:
(5, 7, 20) -> Option B
Quick Check:
Output shape = (batch, seq_len, hidden_size) [OK]
- Confusing batch and sequence dimensions
- Ignoring the effect of batch_first=True
- Assuming output shape equals input shape
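The effect of batch_first is easiest to see by feeding the same tensor to two GRUs that differ only in that flag. This is a small sketch; the variable names are chosen here for illustration.

```python
import torch
import torch.nn as nn

x = torch.randn(5, 7, 10)

gru_bf = nn.GRU(input_size=10, hidden_size=20, batch_first=True)
out_bf, h_bf = gru_bf(x)         # x is read as (batch=5, seq_len=7, features=10)

gru_default = nn.GRU(input_size=10, hidden_size=20)  # batch_first=False
out_def, h_def = gru_default(x)  # x is read as (seq_len=5, batch=7, features=10)

print(out_bf.shape, h_bf.shape)    # torch.Size([5, 7, 20]) torch.Size([1, 5, 20])
print(out_def.shape, h_def.shape)  # torch.Size([5, 7, 20]) torch.Size([1, 7, 20])
```

The output shapes look identical, but the hidden state reveals which axis was treated as the batch. The hidden state is always (num_layers, batch, hidden_size), whatever batch_first is set to.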
gru = nn.GRU(input_size=50, hidden_size=100)
input = torch.randn(32, 10, 100)  # batch=32, seq_len=10, input_size=100
output, hidden = gru(input)

What is the likely cause of the error?
Solution
Step 1: Check GRU input_size vs input tensor last dimension
The GRU was created with input_size=50, but the input tensor's last dimension is 100, so the sizes do not match.
Step 2: Understand tensor shape requirements
GRU input is (seq_len, batch, input_size) by default, or (batch, seq_len, input_size) with batch_first=True. Either way, the last dimension must equal the GRU's input_size parameter.
Final Answer:
Input size 100 does not match GRU input_size 50 -> Option A
Quick Check:
Input size mismatch [OK]
- Blaming batch size for error
- Thinking sequence length is invalid
- Assuming GRU only accepts 2D input
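One way to catch this kind of mismatch early is to compare the tensor's last dimension with the GRU's input_size before the forward pass. The sketch below assumes batch_first=True so the tensor matches the batch/seq_len comment in the question; check_features is a hypothetical helper, not a PyTorch function.

```python
import torch
import torch.nn as nn

gru = nn.GRU(input_size=50, hidden_size=100, batch_first=True)
bad = torch.randn(32, 10, 100)   # feature size 100: does not match input_size=50
good = torch.randn(32, 10, 50)   # feature size 50: matches


def check_features(gru, x):
    """Return True if the last dimension of x matches the GRU's input_size."""
    return x.shape[-1] == gru.input_size


print(check_features(gru, bad), check_features(gru, good))  # False True

# The mismatched tensor is rejected by PyTorch with a RuntimeError.
try:
    gru(bad)
    raised = False
except RuntimeError:
    raised = True

out, _ = gru(good)
print(raised, out.shape)  # True torch.Size([32, 10, 100])
```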
Solution
Step 1: Understand variable-length sequence handling
A batched GRU input is a single tensor, so all sequences in a batch must share one length unless they are passed as a packed sequence.
Step 2: Use padding and packing for variable-length inputs
Padding sequences to the maximum length and then using pack_padded_sequence lets the GRU skip the padded positions during processing.
Final Answer:
Pad all sequences to the same length and use pack_padded_sequence before GRU. -> Option D
Quick Check:
Padding + pack_padded_sequence = D [OK]
- Truncating sequences too short loses info
- Feeding raw variable-length sequences causes errors
- Switching to CNN ignores GRU benefits
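The padding and packing steps can be sketched as follows. The three random tensors stand in for already-embedded sentences of different lengths, and the sizes (feature size 8, hidden size 16) are assumptions made for this example.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import (pad_sequence, pack_padded_sequence,
                                pad_packed_sequence)

# Three hypothetical embedded sequences of lengths 5, 3 and 2 (feature size 8).
seqs = [torch.randn(5, 8), torch.randn(3, 8), torch.randn(2, 8)]
lengths = torch.tensor([len(s) for s in seqs])

# Pad to the longest sequence, then pack so the GRU skips the padding.
padded = pad_sequence(seqs, batch_first=True)  # (3, 5, 8), zero-padded
packed = pack_padded_sequence(padded, lengths, batch_first=True,
                              enforce_sorted=False)

gru = nn.GRU(input_size=8, hidden_size=16, batch_first=True)
packed_out, hidden = gru(packed)

# Unpack back to a padded tensor for downstream layers.
output, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
print(output.shape)  # torch.Size([3, 5, 16])
print(hidden.shape)  # torch.Size([1, 3, 16])
```

Here hidden holds the state at each sequence's true last step rather than at a padded position, which is usually what a classifier on top of the GRU needs.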
