Multilingual models in NLP

Multilingual models help computers understand and work with many languages at once. This saves time and effort compared to building separate models for each language.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = 'xlm-roberta-base'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

inputs = tokenizer('Hello, how are you?', return_tensors='pt')
outputs = model(**inputs)
This example uses the Hugging Face Transformers library, which supports many multilingual models.
Replace 'xlm-roberta-base' with other multilingual model names as needed.
model_name = 'bert-base-multilingual-cased'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

model_name = 'bert-base-multilingual-uncased'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
text = 'Bonjour, comment ça va?'
inputs = tokenizer(text, return_tensors='pt')
outputs = model(**inputs)
This code loads a multilingual model and runs it as a text classifier on English, Spanish, and French sentences in a single batch. The output shows the raw scores (logits) and the predicted class for each sentence. Note that the classification head on top of a base checkpoint is randomly initialized, so the predicted classes are not meaningful until the model has been fine-tuned on labeled data.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load a multilingual model and tokenizer.
# The classification head of 'xlm-roberta-base' is randomly initialized,
# so predictions are only meaningful after fine-tuning.
model_name = 'xlm-roberta-base'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Example texts in different languages
texts = ['Hello, how are you?', 'Hola, ¿cómo estás?', 'Bonjour, comment ça va?']

# Tokenize the batch, padding shorter sentences and truncating long ones
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors='pt')

# Run the model
outputs = model(**inputs)

# Get logits and predicted classes
logits = outputs.logits
predictions = torch.argmax(logits, dim=1)
print('Logits:', logits)
print('Predicted classes:', predictions)
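The logits above are unnormalized scores. To turn them into probabilities, apply a softmax (in PyTorch, `torch.softmax(logits, dim=1)`). A minimal pure-Python sketch of the same computation:

```python
import math

def softmax(scores):
    """Convert a list of raw scores (logits) into probabilities."""
    # Subtract the max for numerical stability; this does not change the result.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Example: logits for one sentence over two classes
logits = [2.0, 0.5]
probs = softmax(logits)
print(probs)  # probabilities sum to 1; the higher logit gets the higher probability
```

The probabilities always sum to 1, which makes them easier to interpret and threshold than raw logits.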
Multilingual models share knowledge across languages, which helps especially for languages with less data.
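This sharing can be pictured as a single embedding space in which translations of the same concept sit close together. The toy 2-D vectors and word pairs below are made up for illustration; real multilingual models learn such a space from data:

```python
import math

# Hypothetical 2-D embeddings in a shared multilingual space.
# Translations of the same concept are placed near each other.
embeddings = {
    'cat':   (0.9, 0.1),    # English
    'gato':  (0.88, 0.14),  # Spanish
    'house': (0.1, 0.95),   # English
    'casa':  (0.12, 0.9),   # Spanish
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest(word):
    """Find the closest other word in the shared space."""
    return max((w for w in embeddings if w != word),
               key=lambda w: cosine(embeddings[word], embeddings[w]))

print(nearest('cat'))   # the Spanish translation is the nearest neighbour
print(nearest('casa'))  # and vice versa
```

Because translations cluster together, a classifier trained on English examples in such a space can often handle Spanish inputs too, which is why low-resource languages benefit.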
They may not match the accuracy of a model trained on a single language, but they are very useful when a task spans many languages.
Always check if the model supports the languages you need before using it.
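One rough way to check coverage is to tokenize a sample of text and measure how often the tokenizer falls back to unknown tokens; a high rate suggests the language or script is poorly supported. A minimal sketch with a hypothetical word-level vocabulary standing in for a real tokenizer's vocab (a real check would use the model's own tokenizer and its `unk_token`):

```python
# Hypothetical vocabulary standing in for a real tokenizer's vocab.
vocab = {'hello', 'how', 'are', 'you', 'hola', 'cómo', 'estás'}

def unknown_rate(text):
    """Fraction of whitespace-separated tokens not found in the vocabulary."""
    tokens = text.lower().replace('?', '').replace(',', '').split()
    if not tokens:
        return 0.0
    unknown = sum(1 for t in tokens if t not in vocab)
    return unknown / len(tokens)

print(unknown_rate('Hello, how are you?'))      # 0.0 -- fully covered
print(unknown_rate('Bonjour, comment ça va?'))  # 1.0 -- no coverage at all
```

A model card's list of pretraining languages is the authoritative source; this heuristic just catches obvious mismatches before you commit to a model.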
Multilingual models let you handle many languages with one model.
They save time and resources compared to separate models for each language.
Use libraries like Hugging Face Transformers to easily load and use these models.