
T5 for text-to-text tasks in NLP

Introduction
T5 (Text-to-Text Transfer Transformer) casts every language task into the same format: text in, text out. This makes it easy to train a single model to do many things, such as translation, summarization, or answering questions.
You want to translate sentences from one language to another.
You need to summarize a long article into a short paragraph.
You want to answer questions based on a given text.
You want to convert one style of text into another, like changing formal text to casual.
You want a single model that can handle many different language tasks.
Syntax
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load model and tokenizer
model = T5ForConditionalGeneration.from_pretrained('t5-small')
tokenizer = T5Tokenizer.from_pretrained('t5-small')

# Prepare input text with task prefix
input_text = 'translate English to German: The house is wonderful.'
input_ids = tokenizer(input_text, return_tensors='pt').input_ids

# Generate output
outputs = model.generate(input_ids)

# Decode output
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
T5 uses a task prefix in the input text to tell the model what to do, such as 'translate English to German:'.
Because the model always generates text, every task is handled in the same text-to-text way.
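The prefix convention is easy to capture in a small helper. Below is a minimal sketch; the helper name `make_t5_input` and the prefix table are our own illustration, not part of the transformers API:

```python
# Hypothetical helper: build a T5 input string from a task name and text.
# These prefixes match the ones used in the examples in this article.
PREFIXES = {
    'translate_en_de': 'translate English to German: ',
    'translate_en_fr': 'translate English to French: ',
    'summarize': 'summarize: ',
}

def make_t5_input(task, text):
    """Prepend the task prefix so T5 knows which task to perform."""
    return PREFIXES[task] + text

print(make_t5_input('summarize', 'Machine learning helps computers learn from data.'))
# -> summarize: Machine learning helps computers learn from data.
```

The string returned by the helper is what you would pass to the tokenizer before calling `model.generate`.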
Examples
This input tells T5 to summarize the given sentence.
input_text = 'summarize: Machine learning is a method of teaching computers to learn from data.'
This input tells T5 to translate the English sentence into French.
input_text = 'translate English to French: How are you today?'
This input asks T5 to answer a question using the given context.
input_text = 'question: What is the capital of France? context: France is a country in Europe. Its capital is Paris.'
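For question answering, the input packs both the question and the context into one string. A minimal sketch of how you might build it; the helper `make_qa_input` is our own naming, not a transformers function:

```python
def make_qa_input(question, context):
    """Format a question-answering input the way the example above shows:
    'question: <question> context: <context>'."""
    return f'question: {question} context: {context}'

input_text = make_qa_input(
    'What is the capital of France?',
    'France is a country in Europe. Its capital is Paris.'
)
print(input_text)
# -> question: What is the capital of France? context: France is a country in Europe. Its capital is Paris.
```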
Sample Model
This program shows how to use T5 for two tasks: translating English to German and summarizing a sentence. It loads the model, prepares inputs with task prefixes, generates outputs, and prints the results.
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the small T5 model and tokenizer
model = T5ForConditionalGeneration.from_pretrained('t5-small')
tokenizer = T5Tokenizer.from_pretrained('t5-small')

# Example input: translate English to German
input_text = 'translate English to German: The house is wonderful.'
input_ids = tokenizer(input_text, return_tensors='pt').input_ids

# Generate translation
outputs = model.generate(input_ids, max_length=40)

# Decode and print result
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print('Translation:', result)

# Example input: summarize text
input_text2 = 'summarize: Machine learning helps computers learn from data to make decisions.'
input_ids2 = tokenizer(input_text2, return_tensors='pt').input_ids
outputs2 = model.generate(input_ids2, max_length=20)
summary = tokenizer.decode(outputs2[0], skip_special_tokens=True)
print('Summary:', summary)
Important Notes
Always add a clear task prefix in the input text so T5 knows what to do.
Use the tokenizer to convert text to tokens and back to text after generation.
The 't5-small' model is good for learning and small tasks; larger checkpoints exist and give better results at a higher compute cost.
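The original T5 release ships in five sizes. The mapping below (parameter counts are approximate, from the T5 release) is a sketch of how you might pick a checkpoint; only the checkpoint name changes, the rest of the code stays the same:

```python
# Approximate parameter counts for the original T5 checkpoints.
T5_CHECKPOINTS = {
    't5-small': '60M parameters',
    't5-base': '220M parameters',
    't5-large': '770M parameters',
    't5-3b': '3B parameters',
    't5-11b': '11B parameters',
}

# Swap the name to trade speed for quality, e.g.:
# model = T5ForConditionalGeneration.from_pretrained('t5-base')
checkpoint = 't5-base'
print(checkpoint, '->', T5_CHECKPOINTS[checkpoint])
# -> t5-base -> 220M parameters
```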
Summary
T5 treats all language tasks as text input and text output.
You tell T5 what to do by adding a task prefix in the input text.
T5 can do many tasks like translation, summarization, and question answering with one model.