You want to classify news articles into 5 categories based on their text content. Which model is most suitable for this multi-class text classification task?
Think about models that handle text data and can output multiple classes.
Linear regression and regression trees predict continuous outputs, not categories. K-means is unsupervised clustering, not classification. CNNs can learn local patterns in text sequences (n-gram-like features) and, with a softmax output layer, are suitable for multi-class classification.
You trained a multi-class text classifier with 4 classes. After testing, you got the following predictions and true labels:
Predictions: [2, 0, 1, 3, 1, 0]
True labels: [2, 0, 0, 3, 1, 1]
What is the accuracy of the model on this test set?
Accuracy = (number of correct predictions) / (total predictions).
Comparing element-wise, predictions match the true labels at indices 0, 1, 3, and 4, and disagree at indices 2 and 5 (4 correct out of 6). Accuracy = 4/6 ≈ 0.667.
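The comparison above can be sketched in a few lines of NumPy; the array names are illustrative, and the values are taken from the question:

```python
import numpy as np

# Predictions and true labels from the question
preds = np.array([2, 0, 1, 3, 1, 0])
labels = np.array([2, 0, 0, 3, 1, 1])

# Element-wise comparison gives a boolean array; its mean is the accuracy
accuracy = np.mean(preds == labels)
print(round(float(accuracy), 4))  # 0.6667
```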
Given the logits output from a model for 3 classes: [2.0, 1.0, 0.1], what is the softmax output vector?
import numpy as np

logits = np.array([2.0, 1.0, 0.1])
exp_logits = np.exp(logits)                 # exponentiate each logit
softmax = exp_logits / np.sum(exp_logits)   # normalize so probabilities sum to 1
print(np.round(softmax, 3))                 # [0.659 0.242 0.099]
Softmax converts logits to probabilities that sum to 1.
Calculating softmax step by step: exp(2.0) = 7.389, exp(1.0) = 2.718, exp(0.1) = 1.105. Sum = 11.212. Probabilities: 7.389/11.212 = 0.659, 2.718/11.212 = 0.242, 1.105/11.212 = 0.099, which sum to 1.
You are training a multi-class text classification model on a large dataset. Which batch size choice is likely to improve training stability and speed without using too much memory?
Consider trade-offs between memory, speed, and gradient stability.
Batch size 1 gives noisy gradient estimates and slow training. Very large batch sizes can exhaust memory and tend to generalize worse. Using the entire dataset as one batch is usually impractical. A moderate batch size such as 32 balances speed, memory use, and gradient stability.
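A minimal sketch of mini-batch iteration, assuming a toy dataset of 1000 examples and a batch size of 32 (both numbers are illustrative, not from the question):

```python
import numpy as np

# Toy dataset: 1000 examples represented by their indices
X = np.arange(1000)
batch_size = 32

# Slice the dataset into consecutive mini-batches of up to 32 examples
batches = [X[i:i + batch_size] for i in range(0, len(X), batch_size)]
print(len(batches))      # 32 batches: 31 full batches plus one partial
print(len(batches[-1]))  # 8 examples left over in the final batch
```

In a real training loop the data would be shuffled each epoch before slicing, so gradient estimates stay unbiased across epochs.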
You trained a multi-class text classifier with 6 classes. The training accuracy is 95%, but test accuracy is only 40%. Which issue is most likely causing this problem?
Think about why training accuracy is high but test accuracy is low.
High training accuracy combined with low test accuracy indicates overfitting: the model has memorized the training data rather than learning generalizable patterns. Underfitting would show low training accuracy as well. Test data containing fewer classes would produce a different error pattern, not this gap. Using mean squared error as the loss is suboptimal for classification, but it would not by itself cause a 55-point gap between training and test accuracy.
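The diagnosis above amounts to comparing the two accuracies; a minimal sketch, using the accuracies from the question and an assumed illustrative threshold for what counts as a "large" gap:

```python
# Accuracies from the question
train_acc = 0.95
test_acc = 0.40

# A large generalization gap is the classic signature of overfitting
gap = train_acc - test_acc
if gap > 0.2:  # threshold chosen for illustration only
    print("Large generalization gap -> likely overfitting")
```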