
Lexicon-based approaches (VADER) in NLP - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
VADER Sentiment Polarity Scores Output
What is the output of the following Python code using VADER sentiment analyzer?
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
analyzer = SentimentIntensityAnalyzer()
sentence = "I love sunny days but hate the rain."
scores = analyzer.polarity_scores(sentence)
print(scores)
A. {"neg": 0.292, "neu": 0.458, "pos": 0.25, "compound": -0.0516}
B. {"neg": 0.25, "neu": 0.5, "pos": 0.25, "compound": 0.0}
C. {"neg": 0.0, "neu": 0.5, "pos": 0.5, "compound": 0.6249}
D. {"neg": 0.292, "neu": 0.458, "pos": 0.25, "compound": 0.6249}
💡 Hint
Remember that VADER returns four scores: negative, neutral, positive, and compound.
🧠 Conceptual
intermediate
Understanding VADER's Compound Score Range
What is the range of the compound score produced by VADER sentiment analysis, and what does it represent?
A. The range is 0 to 100; it represents the percentage of positive words in the text.
B. The range is 0 to 1; it represents the probability of positive sentiment.
C. The range is -1 to 0; it represents the intensity of negative sentiment only.
D. The range is -1 to 1; it represents the overall sentiment from most negative to most positive.
💡 Hint
Think about how VADER summarizes sentiment in one number.
Metrics
advanced
Evaluating VADER Sentiment Classification Accuracy
You have a dataset of 1000 sentences, each labeled positive or negative. You classify them with VADER's compound score, treating scores of 0.05 or above as positive and scores of -0.05 or below as negative. Which metric best measures how well VADER classifies sentiment?
A. Accuracy, because it measures the proportion of correctly classified sentences.
B. Silhouette Score, because it measures cluster separation.
C. Mean Squared Error, because it measures the difference between predicted and true labels.
D. Perplexity, because it measures language model uncertainty.
💡 Hint
Consider that the task is classification, not regression or clustering.
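The thresholding rule in this question can be sketched in a few lines. This is a minimal illustration, not part of the vaderSentiment API: the function name `label_from_compound` is hypothetical, the inclusive boundaries are an assumption based on the commonly used convention, and the sample scores are invented for demonstration rather than produced by VADER.

```python
from collections import Counter

# Thresholds stated in the question above.
POS_THRESHOLD = 0.05
NEG_THRESHOLD = -0.05


def label_from_compound(compound: float) -> str:
    """Map a compound score to a label (hypothetical helper).

    Assumes inclusive boundaries: >= 0.05 is positive, <= -0.05 is negative.
    """
    if compound >= POS_THRESHOLD:
        return "positive"
    if compound <= NEG_THRESHOLD:
        return "negative"
    # Scores strictly between the two thresholds fall in a neutral band.
    return "neutral"


# Hypothetical compound scores, invented for illustration only.
sample_scores = [0.62, -0.41, 0.01, 0.05, -0.05]
predicted = [label_from_compound(s) for s in sample_scores]
label_counts = Counter(predicted)

print(predicted)
print(label_counts)
```

A score inside the neutral band matches neither gold label in a dataset labeled only positive or negative, so an evaluation has to decide how to treat such predictions before scoring them.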
🔧 Debug
advanced
Identifying Error in VADER Usage Code
What error will the following code raise when run?
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
analyzer = SentimentIntensityAnalyzer()
scores = analyzer.polarity_scores(12345)
print(scores)
A. ValueError: invalid literal for int() with base 10
B. TypeError: expected string or bytes-like object
C. No error; outputs sentiment scores
D. AttributeError: 'int' object has no attribute 'lower'
💡 Hint
VADER preprocesses input by converting it to a string first.
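Because the handling of non-string input may differ between library releases, a caller can make the coercion explicit rather than rely on it. The sketch below is an illustration under stated assumptions: `safe_polarity_scores` is a hypothetical helper, and `StubAnalyzer` is a stand-in used only so the example runs without the library installed. Its fixed scores say nothing about what VADER would return for this input.

```python
from typing import Any, Dict


def safe_polarity_scores(analyzer: Any, text: Any) -> Dict[str, float]:
    """Coerce `text` to str before scoring (hypothetical helper)."""
    if not isinstance(text, str):
        text = str(text)
    return analyzer.polarity_scores(text)


class StubAnalyzer:
    """Stand-in exposing the same method name; NOT the real VADER analyzer."""

    def polarity_scores(self, text: str) -> Dict[str, float]:
        if not isinstance(text, str):
            raise TypeError("expected str")
        # Fixed placeholder scores; a real analyzer would compute these.
        return {"neg": 0.0, "neu": 1.0, "pos": 0.0, "compound": 0.0}


# The integer is converted to "12345" before it reaches the analyzer.
scores = safe_polarity_scores(StubAnalyzer(), 12345)
print(scores)
```

With vaderSentiment installed, a `SentimentIntensityAnalyzer` instance would be passed in place of the stub.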
Model Choice
expert
Choosing the Best Sentiment Analysis Approach for Social Media Text
You want to analyze the sentiment of short social media posts that contain slang, emojis, and informal language. Which approach is best suited?
A. Traditional bag-of-words model with logistic regression, because it captures word frequency.
B. Lexicon-based approach using VADER, because it is tuned for social media text and handles emojis and slang well.
C. Topic modeling with LDA, because it finds themes in text.
D. Rule-based sentiment analysis using fixed dictionaries without handling slang.
💡 Hint
Consider which method is designed for informal, short texts with emojis.