Introduction
T5 turns all language tasks into a simple text input and text output format. This makes it easy to teach a model to do many things like translation, summarization, or answering questions.
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    # Load model and tokenizer
    model = T5ForConditionalGeneration.from_pretrained('t5-small')
    tokenizer = T5Tokenizer.from_pretrained('t5-small')

    # Prepare input text with task prefix
    input_text = 'translate English to German: The house is wonderful.'
    input_ids = tokenizer(input_text, return_tensors='pt').input_ids

    # Generate output
    outputs = model.generate(input_ids)

    # Decode output
    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
    input_text = 'summarize: Machine learning is a method of teaching computers to learn from data.'

    input_text = 'translate English to French: How are you today?'

    input_text = 'question: What is the capital of France? context: France is a country in Europe. Its capital is Paris.'

    from transformers import T5Tokenizer, T5ForConditionalGeneration

    # Load the small T5 model and tokenizer
    model = T5ForConditionalGeneration.from_pretrained('t5-small')
    tokenizer = T5Tokenizer.from_pretrained('t5-small')

    # Example input: translate English to German
    input_text = 'translate English to German: The house is wonderful.'
    input_ids = tokenizer(input_text, return_tensors='pt').input_ids

    # Generate translation
    outputs = model.generate(input_ids, max_length=40)

    # Decode and print result
    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print('Translation:', result)

    # Example input: summarize text
    input_text2 = 'summarize: Machine learning helps computers learn from data to make decisions.'
    input_ids2 = tokenizer(input_text2, return_tensors='pt').input_ids
    outputs2 = model.generate(input_ids2, max_length=20)
    summary = tokenizer.decode(outputs2[0], skip_special_tokens=True)
    print('Summary:', summary)
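The inputs above all share one shape: a task prefix ending in a colon, followed by the text. As a minimal sketch of that pattern, the helpers below assemble such strings. The names `with_prefix` and `qa_input` are hypothetical and introduced only for illustration; they are not part of the transformers library.

```python
# Hypothetical helpers (not part of transformers) for building T5-style inputs.

def with_prefix(prefix: str, text: str) -> str:
    """Return '<prefix>: <text>' with exactly one colon and one space."""
    return f"{prefix.strip().rstrip(':')}: {text.strip()}"


def qa_input(question: str, context: str) -> str:
    """Return the 'question: ... context: ...' form shown in the examples above."""
    return f"question: {question.strip()} context: {context.strip()}"


prompts = [
    with_prefix("summarize", "Machine learning is a method of teaching computers to learn from data."),
    with_prefix("translate English to French", "How are you today?"),
    qa_input("What is the capital of France?", "France is a country in Europe. Its capital is Paris."),
]

for prompt in prompts:
    print(prompt)
```

Each resulting string can be passed to the tokenizer in place of a hand-written `input_text`.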
To request a summary, put summarize: before the input text. An input that correctly uses "summarize:" is the valid choice; others are for different tasks or invalid.

Input: translate English to German: The cat is on the mat.
What is the expected output?

Input: summarize The quick brown fox jumps over the lazy dog.
The output is not a summary. What is the likely error?

Input: answer question: What is the capital of France? Context: Paris is the capital city of France.
This correctly includes the question and context with the proper prefix. Others either miss the prefix or use wrong tasks.
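The items above hinge on whether the task prefix is written exactly, including its colon. A minimal sketch of a check for that follows. `KNOWN_PREFIXES` and `check_prefix` are hypothetical names, and the prefix list is an assumption limited to prefixes that appear in this section; it is not an exhaustive list of what T5 accepts.

```python
# Assumed, illustrative list of prefixes taken from the examples in this section.
KNOWN_PREFIXES = (
    "summarize",
    "translate English to German",
    "translate English to French",
)


def check_prefix(input_text: str) -> str:
    """Report whether input_text starts with a known prefix followed by a colon."""
    for prefix in KNOWN_PREFIXES:
        if input_text.startswith(prefix + ":"):
            return "ok"
        if input_text.startswith(prefix + " "):
            return f"missing colon after '{prefix}'"
    return "no known prefix"


print(check_prefix("summarize: The quick brown fox jumps over the lazy dog."))
print(check_prefix("summarize The quick brown fox jumps over the lazy dog."))
print(check_prefix("translate English to German: The cat is on the mat."))
```

Running a check like this before tokenizing catches a dropped colon, which the model would otherwise treat as ordinary input text.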