Model Pipeline - Spam detection pipeline
This spam detection pipeline takes email text and classifies it as spam or not spam. It cleans the text, converts the words into numeric features, trains a model to learn patterns from labeled examples, and then classifies new emails.
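The four stages described above (clean, vectorize, train, predict) can be sketched end to end as follows. This is a minimal illustration, not the original training setup: the clean_text helper and the toy training messages are assumptions made up for the example, and a real pipeline would need far more data.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def clean_text(text):
    """Hypothetical cleaning step: lowercase, drop punctuation, collapse spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


# Assumed toy data for illustration only: 1 = spam, 0 = not spam
train_messages = [
    "Win a free prize now!!!",
    "Free cash prize, claim now",
    "Claim your free reward today",
    "You win cash now",
    "Meeting at noon tomorrow",
    "Lunch meeting at the office",
    "Project notes from the meeting",
    "See you at noon",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

pipeline = Pipeline([
    # Clean the text, then turn words into TF-IDF weights
    ("vectorizer", TfidfVectorizer(preprocessor=clean_text)),
    # Learn which weighted words indicate spam
    ("model", LogisticRegression()),
])
pipeline.fit(train_messages, train_labels)

# Classify new, unseen messages
predictions = pipeline.predict(["Win a free prize now", "Meeting at noon"])
print(predictions)
```

Passing the cleaning function as the vectorizer's preprocessor keeps every step inside one Pipeline object, so the same cleaning is applied at training and at prediction time.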
Loss
0.5 |****
0.4 |***
0.3 |**
0.2 |*
0.1 |
1 2 3 4 5 Epochs
| Epoch | Loss ↓ | Accuracy ↑ | Observation |
|---|---|---|---|
| 1 | 0.45 | 0.78 | Model starts learning basic spam patterns |
| 2 | 0.32 | 0.85 | Loss decreases, accuracy improves |
| 3 | 0.25 | 0.89 | Model captures more features |
| 4 | 0.20 | 0.91 | Training stabilizes with good accuracy |
| 5 | 0.18 | 0.92 | Final epoch with best performance |
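To show where per-epoch numbers like those in the table come from, here is a small sketch that trains a logistic regression by hand with full-batch gradient descent and logs loss and accuracy after each epoch. It does not reproduce the table's values: the four toy messages, the binary word features, and the learning rate are all assumptions chosen to keep the example short.

```python
import math

# Assumed toy data for illustration only: 1 = spam, 0 = not spam
data = [
    ("win free prize now", 1),
    ("free cash offer", 1),
    ("meeting at noon", 0),
    ("lunch at noon tomorrow", 0),
]

vocab = sorted({word for text, _ in data for word in text.split()})


def featurize(text):
    """Binary bag-of-words: 1.0 if the vocabulary word occurs in the text."""
    words = set(text.split())
    return [1.0 if w in words else 0.0 for w in vocab]


X = [featurize(text) for text, _ in data]
y = [label for _, label in data]

weights = [0.0] * len(vocab)
bias = 0.0
learning_rate = 0.5


def predict_proba(x):
    """Sigmoid of the linear score: estimated probability of spam."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


losses, accuracies = [], []
for epoch in range(1, 6):
    # One epoch = one gradient step on the mean log loss over all examples
    errors = [predict_proba(x) - t for x, t in zip(X, y)]
    for j in range(len(vocab)):
        grad_j = sum(e * x[j] for e, x in zip(errors, X)) / len(X)
        weights[j] -= learning_rate * grad_j
    bias -= learning_rate * sum(errors) / len(X)

    # Measure loss and accuracy after the update
    probs = [predict_proba(x) for x in X]
    loss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p) for p, t in zip(probs, y)
    ) / len(X)
    accuracy = sum((p >= 0.5) == bool(t) for p, t in zip(probs, y)) / len(X)
    losses.append(loss)
    accuracies.append(accuracy)
    print(f"epoch {epoch}: loss={loss:.3f} accuracy={accuracy:.2f}")
```

Because the data set is tiny and cleanly separable, accuracy on it saturates immediately; the point is only the pattern of a loss that falls from one epoch to the next.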
How do you build a Pipeline with a TfidfVectorizer and a LogisticRegression model?
What does print(predictions) output if the input messages are ["Win a free prize now", "Meeting at noon"] and the model predicts 1 for spam and 0 for not spam?
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
pipeline = Pipeline([
('vectorizer', TfidfVectorizer()),
('model', LogisticRegression())
])
# Assume pipeline is already trained
messages = ["Win a free prize now", "Meeting at noon"]
predictions = pipeline.predict(messages)
print(predictions)

from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
pipeline = Pipeline([
('vectorizer', CountVectorizer()),  # instantiate the vectorizer; passing the class itself makes fit() fail
('model', LogisticRegression())
])
pipeline.fit(train_messages, train_labels)

Which option configures CountVectorizer with stop words removal?
CountVectorizer has a parameter stop_words which can be set to 'english' to remove common English stop words automatically.
The correct option sets stop_words='english' inside CountVectorizer. The other options either use a StopWordsRemover step, which does not exist in scikit-learn, or set stop_words=None, which disables removal.
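The effect of stop_words='english' can be checked by comparing the vocabulary learned with and without it. This is a small sketch; the two example messages are made up for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Assumed example messages for illustration only
messages = ["Win the free prize", "The meeting is at noon"]

# Default: stop_words=None, so common words stay in the vocabulary
plain = CountVectorizer()
plain.fit(messages)

# With the built-in English list, words such as "the" and "is" are dropped
filtered = CountVectorizer(stop_words="english")
filtered.fit(messages)

print(sorted(plain.vocabulary_))
print(sorted(filtered.vocabulary_))
```

The vocabulary_ attribute maps each kept word to its column index, so a word removed as a stop word simply has no column in the resulting feature matrix.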