
How to Use BERT for NLP Tasks: Simple Guide and Example

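To use BERT for NLP, load a pre-trained BERT model and its matching tokenizer, then convert text into the token IDs the model expects. Pass these IDs, together with the attention mask, to the model to get contextual embeddings. For predictions in tasks such as classification or question answering, load a BERT variant with a task-specific head (for example BertForSequenceClassification) instead of the bare BertModel.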

📐

Syntax

Using BERT involves these main steps:

Load tokenizer: Splits text into the sub-word tokens BERT was trained on.
Tokenize input: Turn text into input IDs and an attention mask.
Load model: Pre-trained BERT weights; the bare BertModel returns embeddings, while task-specific classes add a prediction head.
Run model: Pass the tokenized inputs to get outputs such as embeddings or predictions.

python

from transformers import BertTokenizer, BertModel

# Load pre-trained tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

# Example text
text = "Hello, how are you?"

# Tokenize text
inputs = tokenizer(text, return_tensors='pt')

# Get model outputs
outputs = model(**inputs)

# Extract last hidden states (embeddings)
embeddings = outputs.last_hidden_state
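
To make the tokenize step less of a black box, here is a minimal sketch of WordPiece-style splitting, the greedy longest-match-first scheme BERT tokenizers use. It is a toy, not the library's implementation: the vocabulary TOY_VOCAB and the helpers wordpiece and encode are hypothetical names invented for this illustration, and the sketch skips the punctuation splitting and text normalization the real tokenizer performs.

```python
# Toy WordPiece-style tokenizer over a tiny, made-up vocabulary.
# The real bert-base-uncased vocabulary has about 30,000 entries.
TOY_VOCAB = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "hello", "token", "##ize", "##s"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(TOY_VOCAB)}


def wordpiece(word):
    """Split one lowercase word into sub-word pieces, or [UNK] if no split exists."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a continuation piece
            if candidate in TOKEN_TO_ID:
                match = candidate
                break
            end -= 1
        if match is None:
            return ["[UNK]"]
        pieces.append(match)
        start = end
    return pieces


def encode(text):
    """Lowercase, split on whitespace, apply WordPiece, add [CLS] and [SEP]."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece(word))
    tokens.append("[SEP]")
    input_ids = [TOKEN_TO_ID[t] for t in tokens]
    attention_mask = [1] * len(input_ids)  # no padding yet, so every token is real
    return tokens, input_ids, attention_mask


tokens, input_ids, attention_mask = encode("Hello tokenizes")
print(tokens)          # ['[CLS]', 'hello', 'token', '##ize', '##s', '[SEP]']
print(input_ids)       # [2, 4, 5, 6, 7, 3]
print(attention_mask)  # [1, 1, 1, 1, 1, 1]
```

The point to take away is that one word can become several tokens, which is why the sequence length the model sees is usually larger than the word count.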

💻

Example

This example shows how to use BERT to get one contextual embedding per token in a sentence. Unlike static word vectors, each embedding depends on the surrounding words. These embeddings can feed many NLP tasks, such as classification or similarity.

python

from transformers import BertTokenizer, BertModel
import torch

# Load tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

# Input sentence
sentence = "BERT helps computers understand language."

# Tokenize and get input tensors
inputs = tokenizer(sentence, return_tensors='pt')

# Run model to get outputs
with torch.no_grad():
    outputs = model(**inputs)

# Get embeddings for each token
embeddings = outputs.last_hidden_state

# Print shape of embeddings tensor
print(f"Embeddings shape: {embeddings.shape}")

Output

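Embeddings shape: torch.Size([1, 9, 768])

The three dimensions are batch size, sequence length and hidden size. The sequence length counts tokens, not words, and includes the special [CLS] and [SEP] tokens the tokenizer adds, so it depends on how the sentence is split into sub-word pieces. The hidden size is 768 for bert-base-uncased.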
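
For similarity tasks, the per-token embeddings are usually collapsed into one vector per sentence first. One common option (an assumption here, not something the example above prescribes) is mean pooling over the real tokens followed by cosine similarity. The sketch below uses plain Python lists and made-up 3-dimensional vectors in place of real 768-dimensional BERT outputs; mean_pool and cosine_similarity are hypothetical helper names.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the vectors of real tokens (mask == 1), skipping padding."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    return [value / count for value in total]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up "token embeddings"; the last row of sentence_a stands in for padding.
sentence_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [9.0, 9.0, 9.0]]
mask_a = [1, 1, 0]
sentence_b = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
mask_b = [1, 1]

vec_a = mean_pool(sentence_a, mask_a)  # padding row is ignored
vec_b = mean_pool(sentence_b, mask_b)
similarity = cosine_similarity(vec_a, vec_b)
print(vec_a, vec_b, round(similarity, 6))
```

With real model outputs, the same idea applies to outputs.last_hidden_state and the attention_mask returned by the tokenizer, typically written with tensor operations rather than Python loops.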

⚠️

Common Pitfalls

Common mistakes when using BERT include:

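Using a tokenizer that does not match the model checkpoint, so token IDs map to the wrong vocabulary entries.
Feeding raw text directly to the model without tokenization.
Ignoring attention masks, which tell the model which positions hold real tokens and which hold padding.
Leaving the model in training mode during inference: dropout stays active, so the same input can give different outputs. from_pretrained returns the model in evaluation mode, but call model.eval() explicitly if the model was ever switched to train().

Always use the tokenizer from the same model checkpoint and pass attention masks to the model. Calling model(**inputs) passes both input_ids and attention_mask from the tokenizer output.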
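
Attention masks matter most when sentences of different lengths are batched together and the shorter ones are padded. As a rough sketch of what the tokenizer does when asked to pad (for example with padding=True), the following pure-Python helper pads ID lists to a common length and builds the matching mask. pad_batch is a hypothetical name, the token IDs are illustrative, and the padding ID of 0 is an assumption that matches bert-base-uncased.

```python
PAD_ID = 0  # assumption: id 0 is the [PAD] token, as in bert-base-uncased


def pad_batch(batch_ids, pad_id=PAD_ID):
    """Pad every ID list to the longest length and build attention masks."""
    max_len = max(len(ids) for ids in batch_ids)
    input_ids, attention_mask = [], []
    for ids in batch_ids:
        padding = max_len - len(ids)
        input_ids.append(ids + [pad_id] * padding)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(ids) + [0] * padding)
    return input_ids, attention_mask


# Two already-tokenized sentences of different lengths (illustrative IDs).
batch = [[101, 7592, 2088, 102], [101, 7592, 102]]
padded_ids, masks = pad_batch(batch)
print(padded_ids)  # [[101, 7592, 2088, 102], [101, 7592, 102, 0]]
print(masks)       # [[1, 1, 1, 1], [1, 1, 1, 0]]
```

Without the mask, the model would treat the trailing padding ID as a real token and let it influence the embeddings of the shorter sentence.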

python

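import torch
from transformers import BertTokenizer, BertModel

# Wrong way: feeding raw text directly
# model('Hello world')  # This will cause an error

# Right way: tokenizer and model come from the same checkpoint
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
model.eval()  # make sure dropout is disabled for inference

text = "Hello world"
# inputs holds input_ids, token_type_ids and attention_mask
inputs = tokenizer(text, return_tensors='pt')

# No gradients are needed for inference
with torch.no_grad():
    outputs = model(**inputs)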

📊

Quick Reference

Key points to remember when using BERT:

Always load tokenizer and model from the same pre-trained checkpoint.
Use tokenizer(text, return_tensors='pt') to prepare inputs.
Pass input_ids and attention_mask to the model.
Use outputs.last_hidden_state for embeddings.
Call model.eval() and run inference inside torch.no_grad().

✅

Key Takeaways

Load BERT tokenizer and model from the same pre-trained checkpoint for compatibility.

Always tokenize text before passing it to the BERT model using the tokenizer.

Use attention masks to help BERT focus on real tokens, ignoring padding.

Extract embeddings from the model's last hidden state for downstream NLP tasks.

Set the model to evaluation mode during inference to get consistent results.