You want to build a model that classifies movie reviews as positive or negative. Which model is the best choice for this binary text classification task?
Think about models that understand sequences of words and context in text.
RNNs and transformer models are designed to process sequences like text and to capture context, making them well suited to sentiment analysis. Image-oriented CNNs, clustering algorithms, and regression models are not appropriate for this task.
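A toy sketch of why order-aware processing matters here. The two-word lexicon and the negation rule are invented for illustration, not a real model; a sequence model learns this kind of context effect from data rather than from a hand-written rule.

```python
# Toy illustration (not a real model): why sequence order matters for sentiment.
# The lexicon and the negation rule are invented for demonstration.
LEXICON = {"good": 1, "great": 1, "bad": -1, "boring": -1}

def bag_of_words_score(tokens):
    """Order-blind scoring: just sums word polarities."""
    return sum(LEXICON.get(t, 0) for t in tokens)

def sequential_score(tokens):
    """Order-aware scoring: 'not' flips the polarity of the next word."""
    score, flip = 0, 1
    for t in tokens:
        if t == "not":
            flip = -1
            continue
        score += flip * LEXICON.get(t, 0)
        flip = 1
    return score

review = "not good".split()
print(bag_of_words_score(review))  # 1  -> wrongly positive
print(sequential_score(review))    # -1 -> correctly negative
```

The order-blind scorer cannot distinguish "not good" from "good", which is exactly the kind of context an RNN or transformer captures.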
You trained a model to classify news articles into 5 categories. Which metric is best to evaluate overall model performance?
Consider a metric that measures correct predictions over total predictions for classification.
Accuracy measures the proportion of correct predictions in classification tasks. MSE is a regression metric, BLEU measures language-generation quality, and perplexity measures a language model's uncertainty.
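A minimal sketch of the accuracy computation for a multi-class classifier. The five category labels are invented example data.

```python
# Accuracy = correct predictions / total predictions.
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    assert len(y_true) == len(y_pred), "label lists must be the same length"
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Invented example: 5 articles, 4 predicted correctly.
y_true = ["politics", "sports", "tech", "sports", "world"]
y_pred = ["politics", "sports", "world", "sports", "world"]
print(accuracy(y_true, y_pred))  # 0.8
```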
You need to identify names of people, places, and organizations in text. Which model type is most suitable for this sequence labeling task?
NER requires understanding each word's role in context within a sentence.
Transformer models with token classification heads can assign a label to each token in a sequence, making them ideal for NER. Feedforward networks discard sequence order, KNN is not suited to sequence labeling, and autoencoders are designed for unsupervised representation learning.
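A toy sketch of the per-token output a sequence labeler produces, using BIO-style tags. The gazetteer lookup here is invented for illustration only; a real NER system would use a transformer with a token classification head, not a dictionary.

```python
# Toy sketch of sequence labeling for NER with BIO tags.
# The gazetteer entries are invented; this only illustrates the
# one-label-per-token output format, not a real tagging model.
GAZETTEER = {"paris": "LOC", "alice": "PER", "acme": "ORG"}

def bio_tag(tokens):
    """Assign one label per token: B-<type> for known entities, O otherwise."""
    return [f"B-{GAZETTEER[t.lower()]}" if t.lower() in GAZETTEER else "O"
            for t in tokens]

print(bio_tag("Alice visited Paris".split()))
# ['B-PER', 'O', 'B-LOC']
```

The key point is the shape of the task: the model emits exactly one label per input token, which is what a token classification head on a transformer provides.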
You trained a text classifier using a bag-of-words model and logistic regression. The accuracy is very low on test data. What is the most likely reason?
Think about what information bag-of-words loses about the text.
Bag-of-words treats text as unordered word counts, discarding word order and context, which are essential for understanding meaning. Logistic regression handles binary classification and large datasets well, and overfitting would typically raise training accuracy rather than lower test accuracy alone.
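This loss of information can be shown directly: two invented reviews with opposite meanings produce identical bag-of-words representations, so the classifier literally cannot tell them apart.

```python
from collections import Counter

# Two reviews with opposite meanings but identical word counts:
# a bag-of-words model receives the exact same input for both.
a = "the plot was good not bad".split()
b = "the plot was bad not good".split()

print(Counter(a) == Counter(b))  # True
```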
You are fine-tuning a pre-trained transformer model on a small labeled dataset for text classification. Which hyperparameter setting is most important to avoid overfitting?
Think about techniques that reduce overfitting when data is limited.
Small batch sizes and dropout regularize training and reduce overfitting. High learning rates can cause unstable training; training for many epochs without early stopping risks overfitting; and freezing layers can help, but may limit the model's capacity to adapt.
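A sketch of the early-stopping logic mentioned above, one common guard against overfitting when fine-tuning on small datasets. The validation losses are invented example values, and the patience-based rule is just one simple stopping criterion.

```python
# Early stopping sketch: stop once validation loss has failed to improve
# for `patience` consecutive epochs. Loss values below are invented.
def early_stop_epoch(val_losses, patience=2):
    """Return the epoch index at which training would stop, or the last
    epoch if the stopping rule never triggers."""
    best, bad = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, bad = loss, 0   # improvement: reset the patience counter
        else:
            bad += 1
            if bad >= patience:
                return epoch      # no improvement for `patience` epochs
    return len(val_losses) - 1

# Validation loss improves, then rises as the model starts to overfit.
losses = [0.9, 0.7, 0.6, 0.65, 0.7, 0.8]
print(early_stop_epoch(losses))  # 4
```

Here training halts at epoch 4, after two epochs without improvement on the best loss (0.6 at epoch 2), rather than continuing to overfit.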