NLP · ML · ~10 mins

Sentiment analysis pipeline in NLP - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to import the necessary library for text vectorization.

NLP
from sklearn.feature_extraction.text import [1]
Options:
A. TfidfVectorizer
B. LabelEncoder
C. CountVectorizer
D. train_test_split
Common Mistakes
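Choosing CountVectorizer, which counts word occurrences but does not weight them by how informative each word is.
Using LabelEncoder, which encodes target labels rather than text features (it lives in sklearn.preprocessing, not sklearn.feature_extraction.text).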
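
The difference between counting and weighting can be seen on a tiny example. The sketch below is illustrative only: the three-document corpus is invented, and both vectorizers use their default settings.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical toy corpus, invented purely for illustration.
docs = ["good movie", "bad movie", "good good plot"]

count_matrix = CountVectorizer().fit_transform(docs)
tfidf_matrix = TfidfVectorizer().fit_transform(docs)

# Both vectorizers learn the same vocabulary, so the shapes match.
print(count_matrix.shape, tfidf_matrix.shape)

# CountVectorizer stores raw integer counts per document.
print(count_matrix.toarray()[2])

# TfidfVectorizer rescales those counts by inverse document frequency
# and (by default) L2-normalises each row.
print(tfidf_matrix.toarray()[2].round(2))
```

Words that appear in many documents receive a lower weight under TF-IDF, which is why it is often preferred for sentiment features.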
2. Fill in the blank (medium)

Complete the code to split the dataset into training and testing sets.

NLP
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=[1], random_state=42)
Options:
A. 0.1
B. 0.2
C. 0.5
D. 1.0
Common Mistakes
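Using 1.0, which would leave no data for training; scikit-learn rejects it with a ValueError because a float test_size must lie strictly between 0 and 1.
Using 0.5, which splits the data evenly but leaves less data for training than the more common 0.2.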
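
As a rough illustration of how test_size affects the split, the sketch below uses ten made-up samples (the data and labels are hypothetical) and also shows that a float of 1.0 is rejected.

```python
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 short texts with alternating labels.
X = [f"review {i}" for i in range(10)]
y = [i % 2 for i in range(10)]

# test_size=0.2 holds out 20% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(len(X_train), len(X_test))

# A float test_size of 1.0 is outside the allowed (0, 1) range.
try:
    train_test_split(X, y, test_size=1.0, random_state=42)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

Fixing random_state makes the shuffle reproducible, so the same rows land in the test set on every run.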
3. Fill in the blank (hard)

Fix the error in the model training code by completing the missing method.

NLP
model = LogisticRegression()
model.[1](X_train, y_train)
Options:
A. predict
B. score
C. transform
D. fit
Common Mistakes
Calling predict before the model has been fitted, which raises a NotFittedError.
Using transform, which belongs to feature transformers such as vectorizers, not to classifiers.
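
The first mistake can be reproduced directly. This is a minimal sketch on hypothetical one-dimensional numeric data (not real text features), showing that predict fails until fit has been called.

```python
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LogisticRegression

# Hypothetical 1-D training data: small values are class 0, large are class 1.
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0, 0, 1, 1]

model = LogisticRegression()

# Predicting with an unfitted model raises NotFittedError.
try:
    model.predict([[1.5]])
    failed_before_fit = False
except NotFittedError:
    failed_before_fit = True

# fit learns the coefficients; only then can predict be used.
model.fit(X_train, y_train)
preds = model.predict([[0.0], [3.0]])
print(failed_before_fit, preds)
```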
4. Fill in the blank (hard)

Complete the code to generate predictions on the test set and calculate accuracy.

NLP
predictions = model.[1](X_test)
accuracy = accuracy_score(y_test, predictions,)
Options:
A. predict
B. )
C. ,
D. predict_proba
Common Mistakes
Using predict_proba, which returns class probabilities rather than the predicted labels that accuracy_score expects.
Missing the comma between arguments in accuracy_score.
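
To make the difference between the two methods concrete, the following sketch compares their outputs on hypothetical one-dimensional data; the values are invented and stand in for real vectorized text.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Hypothetical numeric features standing in for vectorized text.
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0, 0, 1, 1]
X_test = [[0.5], [2.5]]
y_test = [0, 1]

model = LogisticRegression().fit(X_train, y_train)

# predict returns one label per sample.
predictions = model.predict(X_test)

# predict_proba returns one probability per class per sample.
probabilities = model.predict_proba(X_test)

# accuracy_score compares true labels with predicted labels.
accuracy = accuracy_score(y_test, predictions)
print(predictions.shape, probabilities.shape, accuracy)
```

Passing the probability matrix to accuracy_score instead of the label array would fail, because the metric compares labels element by element.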
5. Fill in the blank (hard)

Fill all three blanks to build a simple sentiment analysis pipeline.

NLP
from sklearn.pipeline import Pipeline
pipeline = Pipeline([
    ('vectorizer', [1]()),
    ('classifier', [2]())
])
pipeline.[3](X_train, y_train)
Options:
A. TfidfVectorizer
B. LogisticRegression
C. fit
D. CountVectorizer
Common Mistakes
Using CountVectorizer instead of TfidfVectorizer, which weights terms by how distinctive they are across documents.
Calling predict instead of fit to train the pipeline.
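
Put together, the steps practised above might look like the sketch below. The four training sentences and their labels are hypothetical, and default hyperparameters are assumed throughout; a real sentiment model would need far more data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical toy dataset: 1 = positive, 0 = negative.
X_train = ["great film", "loved it", "terrible film", "hated it"]
y_train = [1, 1, 0, 0]

# The pipeline vectorizes raw text, then classifies the resulting features.
pipeline = Pipeline([
    ("vectorizer", TfidfVectorizer()),
    ("classifier", LogisticRegression()),
])

# fit trains both steps in order on the raw strings.
pipeline.fit(X_train, y_train)

# predict applies the fitted vectorizer, then the fitted classifier.
new_predictions = pipeline.predict(["great", "terrible"])
print(new_predictions)
```

Wrapping both steps in a Pipeline means the vectorizer is fitted only on the training text, which helps avoid leaking test-set vocabulary into the model.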