Recall & Review
beginner
What is the main idea behind the Naive Bayes algorithm in text classification?
Naive Bayes assumes that each word in a text is independent of the others given the category, and uses Bayes' theorem to calculate the probability that the text belongs to each category.
beginner
Why is Naive Bayes called 'naive'?
Because it assumes that all features (words) are independent of each other given the category. This simplification is rarely true of real language, but the classifier often still works well in practice.
intermediate
What is the role of prior probability in Naive Bayes for text?
The prior probability represents how common each category is before seeing the text, helping the model to balance predictions based on category frequency.
intermediate
How does Naive Bayes handle words that do not appear in the training data for a category?
It uses smoothing techniques like Laplace smoothing to assign a small non-zero probability to unseen words, preventing zero probability issues.
beginner
What metric is commonly used to evaluate the performance of a Naive Bayes text classifier?
Accuracy is commonly used, which measures the percentage of correctly classified texts out of all texts tested.
What assumption does Naive Bayes make about words in a text?
A. Words always appear in pairs
B. Words depend on their position
C. Words are independent of each other
D. Words have no effect on classification
Naive Bayes assumes that each word's presence is independent of the others given the category, which reduces the likelihood of a text to a simple product.
What does Laplace smoothing help with in Naive Bayes?
A. Removing stop words
B. Increasing model complexity
C. Reducing training time
D. Handling unseen words in categories
Laplace smoothing assigns a small non-zero probability to words that were never seen with a category in training, so a single such word cannot force the category's probability to zero.
Which formula is central to Naive Bayes classification?
A. Bayes' theorem
B. Pythagorean theorem
C. Euler's formula
D. Newton's law
Naive Bayes uses Bayes' theorem to calculate the probability of categories given the text.
In text classification, what does the 'prior' represent?
A. The initial probability of each category
B. The length of the text
C. The number of words in the text
D. The order of words
The prior is the probability of each category before considering the text's words.
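As a rough illustration of the prior, the sketch below estimates it from label frequencies alone, before any words are considered. The label list and the `class_priors` helper are made up for this example.

```python
from collections import Counter

# Hypothetical training labels, invented for illustration
labels = ["spam", "ham", "ham", "ham"]

def class_priors(labels):
    """Estimate P(category) as the fraction of training texts in each category."""
    counts = Counter(labels)
    total = len(labels)
    return {category: count / total for category, count in counts.items()}

priors = class_priors(labels)
print(priors)
```

With three of the four texts labeled 'ham', the model starts out favoring 'ham' before it looks at a single word.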
Which metric best shows how well a Naive Bayes text classifier works?
A. Number of features
B. Accuracy
C. Training time
D. Word count
Accuracy measures the percentage of correct predictions, showing model performance.
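A minimal sketch of the accuracy calculation, assuming the true and predicted labels are available as two equal-length lists. The labels and the `accuracy` helper are invented for this example.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Hypothetical labels, invented for illustration
y_true = ["positive", "negative", "positive", "negative"]
y_pred = ["positive", "negative", "negative", "negative"]

score = accuracy(y_true, y_pred)
print(score)
```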
Explain how Naive Bayes uses word probabilities to classify a text.
Think about how the model combines word chances and category chances.
Describe why smoothing is important in Naive Bayes for text classification.
Consider what happens if a word never appeared in training for a category.
Practice
1. What is the main assumption behind the Naive Bayes algorithm when used for text classification?
easy
A. Words always appear in a fixed order
B. Words in a document are independent of each other given the class label
C. All documents have the same length
D. The frequency of words does not affect classification
Solution
Step 1: Understand Naive Bayes assumption
Naive Bayes assumes that each feature (word) is independent of others given the class label.
Step 2: Relate assumption to text classification
Once the class is known, the presence or absence of one word does not change the probability of any other word in the same document.
Final Answer:
Words in a document are independent of each other given the class label -> Option B
Quick Check:
Naive Bayes = word independence assumption [OK]
Hint: Naive Bayes treats words as independent features [OK]
Common Mistakes:
Thinking word order matters
Assuming word frequency is ignored
Believing documents must be same length
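One consequence of the independence assumption is that word order drops out of the calculation. The sketch below shows this with a hand-written probability table; the numbers and the `document_likelihood` helper are assumptions for illustration, not output from a trained model.

```python
import math

# Hypothetical per-class word probabilities, invented for illustration
word_probs = {"love": 0.4, "this": 0.3, "movie": 0.3}

def document_likelihood(words, word_probs):
    """P(document | class) under the naive assumption: a product over words."""
    return math.prod(word_probs[w] for w in words)

a = document_likelihood(["love", "this", "movie"], word_probs)
b = document_likelihood(["movie", "this", "love"], word_probs)

# The same words in a different order give the same likelihood
print(math.isclose(a, b))
```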
2. Which of the following expressions does Naive Bayes use to score how likely a document is to belong to a class (up to a normalizing constant)?
easy
A. P(class) / \sum_{word} P(word|class)
B. P(class) + \sum_{word} P(word|class)
C. P(class) * \prod_{word} P(word|class)
D. P(class) - \prod_{word} P(word|class)
Solution
Step 1: Recall Naive Bayes formula for text
The probability of a class given a document is proportional to the prior probability of the class times the product of the conditional probabilities of each word given the class.
Step 2: Match formula to options
P(class) * \prod_{word} P(word|class) correctly shows multiplication (product) of P(word|class) terms with P(class).
Final Answer:
P(class) * \prod_{word} P(word|class) -> Option C
Quick Check:
Naive Bayes uses product of word probabilities [OK]
Hint: Multiply class prior by product of word likelihoods [OK]
Common Mistakes:
Adding probabilities instead of multiplying
Dividing probabilities incorrectly
Subtracting probabilities
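The product formula can be sketched in a few lines. The version below works in log space, a common way to avoid numerical underflow when many small probabilities are multiplied; the probability tables and the `class_score` helper are hypothetical values chosen for illustration.

```python
import math

def class_score(words, prior, word_probs):
    """Unnormalized log score: log P(class) + sum of log P(word | class)."""
    return math.log(prior) + sum(math.log(word_probs[w]) for w in words)

# Hypothetical probabilities, invented for illustration
priors = {"positive": 0.5, "negative": 0.5}
likelihoods = {
    "positive": {"love": 0.4, "hate": 0.1, "movie": 0.5},
    "negative": {"love": 0.1, "hate": 0.4, "movie": 0.5},
}

doc = ["love", "movie"]
scores = {c: class_score(doc, priors[c], likelihoods[c]) for c in priors}

# The predicted class is the one with the highest score
best = max(scores, key=scores.get)
print(best)
```

Adding logs is equivalent to multiplying the original probabilities, so the ranking of classes is unchanged.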
3. Given the following code snippet using sklearn's MultinomialNB for text classification, what will be the predicted class for the input text ['love this movie']?
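from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
# Two training documents, one per class
texts = ['I love this movie', 'I hate this movie']
labels = ['positive', 'negative']
# Convert the texts to word-count vectors
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
# Train a multinomial Naive Bayes model (default smoothing alpha=1.0)
model = MultinomialNB()
model.fit(X, labels)
# Vectorize the new text with the same vocabulary, then predict
new_text = vectorizer.transform(['love this movie'])
prediction = model.predict(new_text)
print(prediction[0])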
medium
A. movie
B. negative
C. hate
D. positive
Solution
Step 1: Understand training data and labels
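The model is trained on two texts, one labeled 'positive' and one labeled 'negative'. They differ only in the words 'love' and 'hate'; 'this' and 'movie' occur in both, and CountVectorizer drops the single-letter token 'I' by default.
Step 2: Analyze prediction input
The words 'this' and 'movie' are equally likely under both classes, so 'love' decides the outcome. With the default smoothing (alpha=1.0), P(love|positive) = 2/7 and P(love|negative) = 1/7, so the model predicts 'positive'.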
Final Answer:
positive -> Option D
Quick Check:
Word 'love' matches positive class [OK]
Hint: Check which class words in input appeared during training [OK]
Common Mistakes:
Confusing label names with words
Ignoring vectorizer transformation
Predicting word instead of class
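The prediction can be checked by hand. The sketch below reimplements the smoothed multinomial calculation in plain Python with exact fractions, assuming add-one smoothing and equal priors; it is an illustration of the arithmetic, not sklearn's actual implementation, and the helper names are made up.

```python
from collections import Counter
from fractions import Fraction

# Training data from the question; the single-letter token "I" is left out,
# mirroring CountVectorizer's default tokenization
docs = {
    "positive": ["love", "this", "movie"],
    "negative": ["hate", "this", "movie"],
}
vocab = sorted({w for words in docs.values() for w in words})
alpha = 1  # Laplace smoothing, matching the MultinomialNB default

def smoothed_prob(word, cls):
    """P(word | class) with add-alpha smoothing."""
    counts = Counter(docs[cls])
    return Fraction(counts[word] + alpha, len(docs[cls]) + alpha * len(vocab))

def posterior(words):
    """Normalized P(class | words), with one training text per class (equal priors)."""
    scores = {}
    for cls in docs:
        score = Fraction(1, len(docs))
        for w in words:
            score *= smoothed_prob(w, cls)
        scores[cls] = score
    total = sum(scores.values())
    return {cls: s / total for cls, s in scores.items()}

result = posterior(["love", "this", "movie"])
print(result["positive"])  # 2/3
```

The positive class wins with a posterior of 2/3, which agrees with Option D.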
4. Consider this code snippet using Naive Bayes for text classification:
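from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
# Each class is trained on one text that repeats a single word
texts = ['spam spam spam', 'ham ham ham']
labels = ['spam', 'ham']
# Convert the texts to word-count vectors
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
# Train a multinomial Naive Bayes model (default smoothing alpha=1.0)
model = MultinomialNB()
model.fit(X, labels)
# The new text mixes words from both classes
new_text = vectorizer.transform(['spam ham spam'])
prediction = model.predict(new_text)
print(prediction[0])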
The output is unexpected. What is the likely cause?
medium
A. The input text contains words from both classes causing confusion
B. The vectorizer did not fit on the training data
C. MultinomialNB requires numeric labels, not strings
D. The model cannot handle words not seen in training
Solution
Step 1: Analyze training and input data
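Each class is trained on a single text that repeats one word. The input text mixes words from both classes: 'spam' twice and 'ham' once.
Step 2: Understand Naive Bayes behavior with mixed words
Naive Bayes scores each class as its prior times the likelihoods of the input words. The priors are equal here, so the word counts decide: the model prints 'spam', but the single 'ham' pulls the posterior down to 0.8 under the default smoothing (alpha=1.0). Evidence from both classes makes the prediction less certain than the clean training texts might suggest.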
Final Answer:
The input text contains words from both classes causing confusion -> Option A
Quick Check:
Mixed class words confuse Naive Bayes prediction [OK]
Hint: Mixed class words can confuse Naive Bayes predictions [OK]
Common Mistakes:
Assuming unseen words cause error
Thinking vectorizer was not fitted
Believing labels must be numeric
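For readers who want the arithmetic, here is a worked version of the calculation for the input 'spam ham spam', assuming add-one smoothing (the MultinomialNB default), a two-word vocabulary, and equal priors of 1/2. The symbols c_spam and c_ham for the two classes are notation introduced here.

```latex
% Vocabulary: {ham, spam}; each training text has 3 tokens; alpha = 1
\begin{align*}
P(\text{spam} \mid c_{\text{spam}}) &= \frac{3 + 1}{3 + 2} = \frac{4}{5}, &
P(\text{ham} \mid c_{\text{spam}}) &= \frac{0 + 1}{3 + 2} = \frac{1}{5} \\
P(\text{spam} \mid c_{\text{ham}}) &= \frac{0 + 1}{3 + 2} = \frac{1}{5}, &
P(\text{ham} \mid c_{\text{ham}}) &= \frac{3 + 1}{3 + 2} = \frac{4}{5} \\
\text{score}(c_{\text{spam}}) &= \frac{1}{2} \cdot \left(\frac{4}{5}\right)^{2} \cdot \frac{1}{5} = \frac{8}{125}, &
\text{score}(c_{\text{ham}}) &= \frac{1}{2} \cdot \left(\frac{1}{5}\right)^{2} \cdot \frac{4}{5} = \frac{2}{125} \\
P(c_{\text{spam}} \mid \text{doc}) &= \frac{8/125}{8/125 + 2/125} = \frac{4}{5}
\end{align*}
```

The model still picks 'spam', but the one 'ham' token leaves a 1/5 chance on the other class.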
5. You want to improve a Naive Bayes text classifier that often misclassifies short texts with rare words. Which approach is best to reduce this problem?
hard
A. Use Laplace smoothing to handle rare or unseen words
B. Remove all stop words from the training data
C. Increase the number of classes to make classification finer
D. Use raw word counts without normalization
Solution
Step 1: Identify problem with rare words
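Without smoothing, a word that never occurred with a class gets probability zero for that class. Because the class score is a product of word probabilities, one zero wipes out the entire score, no matter what the other words say.
Step 2: Apply Laplace smoothing
Laplace smoothing adds a small count (typically 1) to every word in every class, preventing zero probabilities and improving classification of texts with rare words.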
Final Answer:
Use Laplace smoothing to handle rare or unseen words -> Option A
Quick Check:
Laplace smoothing fixes zero probability issues [OK]
Hint: Add smoothing to avoid zero probabilities for rare words [OK]
Common Mistakes:
Thinking removing stop words fixes rare word issue
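The effect of smoothing on an unseen word can be sketched as follows. The token list, the vocabulary size, and the `word_prob` helper are assumptions made up for this example.

```python
from collections import Counter

def word_prob(word, class_words, vocab_size, alpha=1.0):
    """P(word | class) with add-alpha smoothing; alpha=0 disables smoothing."""
    counts = Counter(class_words)
    return (counts[word] + alpha) / (len(class_words) + alpha * vocab_size)

# Hypothetical tokens seen for one class, invented for illustration
class_words = ["great", "movie", "great", "plot"]
vocab_size = 5  # assumed vocabulary size across all classes

# "boring" never occurred with this class
unsmoothed = word_prob("boring", class_words, vocab_size, alpha=0.0)
smoothed = word_prob("boring", class_words, vocab_size, alpha=1.0)
print(unsmoothed, smoothed)
```

Without smoothing the unseen word gets probability 0, which would zero out the whole class score; with alpha=1 it gets 1/9 instead.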