Model Pipeline - Multilingual sentiment
This pipeline reads text in multiple languages, cleans it and converts it into numbers, trains a model to classify sentiment as positive or negative, and then predicts the sentiment of new sentences.
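The "clean and convert into numbers" step can be sketched in plain Python. This is a simplified illustration, not the pipeline's actual preprocessing: the helper names (`clean_and_split`, `build_vocab`, `encode`) are hypothetical, and a pretrained multilingual model would normally use its own subword tokenizer instead of a whole-word vocabulary.

```python
import re
import unicodedata


def clean_and_split(text):
    """Normalize Unicode, lowercase, replace punctuation with spaces, split on whitespace."""
    text = unicodedata.normalize("NFKC", text).lower()
    # In Python 3, \w matches Unicode letters, so accented characters are kept
    text = re.sub(r"[^\w\s]", " ", text)
    return text.split()


def build_vocab(token_lists):
    """Give each distinct token an integer id; id 0 is reserved for unknown tokens."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab


def encode(tokens, vocab):
    """Map tokens to their ids, using 0 for tokens not in the vocabulary."""
    return [vocab.get(tok, 0) for tok in tokens]


# Illustrative sentences in English, French, and German
sentences = ["I love this!", "Je suis très content.", "Das ist schlecht."]
tokenized = [clean_and_split(s) for s in sentences]
vocab = build_vocab(tokenized)
encoded = [encode(t, vocab) for t in tokenized]
print(encoded)
```

Each sentence becomes a list of integer ids, which is the kind of numeric input a model can train on.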
Figure: training loss (y-axis, 0.2 to 0.7) plotted against epochs 1 to 6 (x-axis).
| Epoch | Loss ↓ | Accuracy ↑ | Observation |
|---|---|---|---|
| 1 | 0.65 | 0.60 | Model starts learning basic patterns |
| 2 | 0.50 | 0.72 | Accuracy improves as model adjusts weights |
| 3 | 0.40 | 0.80 | Model captures multilingual sentiment features |
| 4 | 0.32 | 0.85 | Loss decreases steadily, accuracy rises |
| 5 | 0.28 | 0.88 | Model converging with good performance |
| 6 | 0.25 | 0.90 | Final epoch with best validation accuracy |
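To make the convergence trend concrete, the sketch below computes the epoch-to-epoch change in loss and accuracy. The numbers are copied from the table; the variable names are only illustrative.

```python
# (epoch, loss, accuracy) values copied from the training table
history = [
    (1, 0.65, 0.60),
    (2, 0.50, 0.72),
    (3, 0.40, 0.80),
    (4, 0.32, 0.85),
    (5, 0.28, 0.88),
    (6, 0.25, 0.90),
]

loss_deltas = []
accuracy_gains = []
for (_, prev_loss, prev_acc), (epoch, loss, acc) in zip(history, history[1:]):
    # Round to two decimals to hide floating-point noise
    loss_deltas.append(round(loss - prev_loss, 2))
    accuracy_gains.append(round(acc - prev_acc, 2))
    print(f"epoch {epoch}: loss {loss_deltas[-1]:+.2f}, accuracy {accuracy_gains[-1]:+.2f}")
```

The size of each change shrinks from one epoch to the next, which is what a converging model is expected to show.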
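Predicting sentiment for a new sentence with a pretrained multilingual model:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Pretrained multilingual model that predicts one of five sentiment labels
model_name = 'nlptown/bert-base-multilingual-uncased-sentiment'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Tokenize a French sentence ("I am very happy") into PyTorch tensors
inputs = tokenizer("Je suis très content", return_tensors="pt")

# Inference only, so gradient tracking is switched off
with torch.no_grad():
    outputs = model(**inputs)

# Turn logits into probabilities, then map class index 0-4 to labels 1-5
probs = torch.nn.functional.softmax(outputs.logits, dim=1)
label = torch.argmax(probs, dim=1).item() + 1  # labels 1 to 5
print(label)
```

The following snippet raises an error when run:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')
tokenizer = AutoTokenizer.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')
inputs = tokenizer('Das ist schlecht', return_tensors='pt')
outputs = model(inputs)
```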
What is the cause of the error?
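One way to see the cause is with a small stand-in, sketched here without the real model. The tokenizer returns a dict-like object of named tensors, and `model(inputs)` passes that whole object as the first positional argument (`input_ids`) instead of unpacking it into keyword arguments. The `forward` function and the token ids below are hypothetical and serve only to show the difference between the two call styles.

```python
def forward(input_ids=None, attention_mask=None):
    """Hypothetical stand-in for a model's forward method (not the real API)."""
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Stand-in for tokenizer output: a mapping of named inputs (ids are made up)
inputs = {"input_ids": [[101, 2023, 102]], "attention_mask": [[1, 1, 1]]}

# Positional call: the entire mapping is bound to the first parameter
wrong = forward(inputs)
print(type(wrong["input_ids"]).__name__, wrong["attention_mask"])

# Unpacked call: each key is matched to the parameter of the same name
right = forward(**inputs)
print(right["input_ids"], right["attention_mask"])
```

In the snippet from the question, the equivalent fix is `outputs = model(**inputs)`, the same call used in the working example.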