
Sentiment analysis pipeline in NLP - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
What is the main purpose of tokenization in a sentiment analysis pipeline?

In a sentiment analysis pipeline, why do we perform tokenization on the input text?

A. To train the model on labeled data
B. To convert sentiment scores into numerical values
C. To split the text into smaller units like words or subwords for easier processing
D. To remove stop words from the text
💡 Hint

Think about how a computer understands text before analyzing sentiment.
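
To make the hint concrete, here is a minimal sketch of what a tokenizer does, assuming a simple regex-based word splitter. The helper name `simple_tokenize` and the sample sentence are illustrative only; real pipelines often use subword tokenizers instead.

```python
import re

def simple_tokenize(text):
    """Split text into lowercase word tokens (hypothetical helper for illustration)."""
    # \w+ matches runs of letters, digits, and underscores; punctuation is dropped.
    return re.findall(r"\w+", text.lower())

# Illustrative input, not taken from any dataset.
tokens = simple_tokenize("The battery life is great, but the screen is dim.")
print(tokens)
```

Each token can then be mapped to an integer ID or a vector, which is the form a model actually consumes.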

Predict Output
intermediate
What is the output of this preprocessing code snippet?

Given the following Python code for preprocessing text in a sentiment analysis pipeline, what is the output?

import re
text = "I love this product! It's amazing."
cleaned = re.sub(r'[^a-zA-Z ]', '', text).lower().split()
print(cleaned)

A. ['I', 'love', 'this', 'product', 'its', 'amazing']
B. ['I', 'love', 'this', 'product', 'It', 's', 'amazing']
C. ['i', 'love', 'this', 'product', 'it', 's', 'amazing']
D. ['i', 'love', 'this', 'product', 'its', 'amazing']
💡 Hint

Look at how punctuation is removed and text is converted to lowercase.
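
One way to reason about a chained expression like this is to run each step separately. The sketch below does so on a different, made-up sentence; the variable names are illustrative.

```python
import re

sample = "Wow, that's GREAT!"  # illustrative input, not the quiz string

# Step 1: delete every character that is not an ASCII letter or a space.
letters_only = re.sub(r"[^a-zA-Z ]", "", sample)
# Step 2: lowercase the whole string.
lowered = letters_only.lower()
# Step 3: split on whitespace into a list of tokens.
tokens = lowered.split()

print(repr(letters_only))
print(repr(lowered))
print(tokens)
```

Note that the substitution replaces matched characters with the empty string rather than with a space, which affects how contractions end up being tokenized.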

Model Choice
advanced
Which model architecture is best suited for capturing context in sentiment analysis?

You want to build a sentiment analysis model that understands the context of words in a sentence. Which model architecture is most suitable?

A. A recurrent neural network (RNN) such as an LSTM or GRU
B. A linear regression model on raw text
C. A simple bag-of-words model with logistic regression
D. A k-nearest neighbors model using word counts
💡 Hint

Think about models that can remember previous words to understand meaning.
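
As a small illustration of why word order matters, the sketch below compares two made-up sentences with opposite sentiment under a count-based representation. The helper `bag_of_words` is a hypothetical name used only for this example.

```python
from collections import Counter

def bag_of_words(sentence):
    """Count word occurrences, ignoring order (illustrative helper)."""
    return Counter(sentence.lower().split())

# Two invented sentences that use the same words in a different order.
a = "the food was good not bad"
b = "the food was bad not good"

print(bag_of_words(a) == bag_of_words(b))  # True: the word counts are identical
print(a.split() == b.split())              # False: the token sequences differ
```

Any representation built purely from word counts sees these two sentences as the same input, whereas a model that reads tokens in sequence can, in principle, tell them apart.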

Metrics
advanced
Which metric is most appropriate to evaluate a sentiment analysis model on imbalanced data?

Your sentiment dataset has many more positive reviews than negative ones. Which evaluation metric should you prioritize?

A. F1-score
B. Recall
C. Accuracy
D. Precision
💡 Hint

Consider a metric that balances precision and recall.
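
The following sketch computes the four metrics by hand on an invented, imbalanced label set (90 positive, 10 negative) for a classifier that always predicts "positive". The data and the helper `binary_metrics` are assumptions made for illustration, with the minority class treated as the class of interest.

```python
def binary_metrics(y_true, y_pred, positive):
    """Return accuracy, precision, recall, and F1 for the given positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up data: 1 = positive review, 0 = negative review.
y_true = [1] * 90 + [0] * 10
y_pred = [1] * 100  # a degenerate classifier that always says "positive"

scores = binary_metrics(y_true, y_pred, positive=0)
print(scores)
```

Comparing the numbers side by side shows how differently each metric reacts to a classifier that ignores the minority class entirely.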

🔧 Debug
expert
Why does this sentiment analysis model always predict the same class?

Here is a snippet of a sentiment analysis model training code. The model always predicts the same sentiment class regardless of input. What is the most likely cause?

import torch
import torch.nn as nn

class SimpleSentimentModel(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_classes):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, x):
        embedded = self.embedding(x)  # shape: (batch_size, seq_len, embed_dim)
        pooled = embedded.mean(dim=1)  # average over seq_len
        out = self.fc(pooled)
        return out

model = SimpleSentimentModel(vocab_size=1000, embed_dim=50, num_classes=2)

# Training loop omitted for brevity

# After training, model always predicts class 0.

A. The model's output layer lacks a softmax or sigmoid activation, so predictions are not probabilities
B. The training data is imbalanced and the model learned to predict the majority class
C. The embedding layer is not updating because requires_grad is False
D. The pooling operation averages embeddings, losing important sequence information
💡 Hint

Consider the effect of imbalanced classes on model predictions.
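
When investigating this kind of behavior, a reasonable first step is to inspect the label distribution of the training set. The sketch below does that on invented labels and derives inverse-frequency class weights, one common heuristic among several. The helper name and the 900/100 split are assumptions for illustration, since the original training loop and data are not shown.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Made-up training labels: class 0 heavily outnumbers class 1.
labels = [0] * 900 + [1] * 100

print(Counter(labels))          # shows how skewed the classes are
weights = inverse_frequency_weights(labels)
print(weights)                  # the rarer class receives the larger weight
```

In a PyTorch setup such as the one in the snippet, weights like these could be converted to a tensor and passed to `torch.nn.CrossEntropyLoss` through its `weight` argument, so that errors on the rarer class count for more during training.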