
Multimodal RAG in Prompt Engineering / GenAI - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to load a pretrained multimodal retriever model.

from transformers import [1]
retriever = [1].from_pretrained('multimodal-rag-base')
A. RagSequenceForGeneration
B. RagTokenizer
C. RagRetriever
D. RagConfig
Common Mistakes
Using RagTokenizer instead of RagRetriever.
Using RagConfig, which only holds configuration and does not load a retriever.
2. Fill in the blank (medium)

Complete the code to encode an image input for the multimodal retriever.

from PIL import Image
image = Image.open('input.jpg')
inputs = retriever.image_encoder.[1](image, return_tensors='pt')
A. forward
B. encode
C. process
D. encode_image
Common Mistakes
Using encode, which is generic and may not exist on the image encoder.
Calling forward directly without preprocessing the image first.
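As a rough illustration of what the preprocessing step in this task does, the sketch below converts a tiny RGB image into a normalized, channel-first pixel_values batch using only plain Python. The function name preprocess_image and the mean/std of 0.5 are assumptions made for this example; it is not the exercise's image_encoder API, only the general idea of turning raw pixels into model-ready numbers before any forward pass.

```python
def preprocess_image(pixels, mean=0.5, std=0.5):
    """Toy preprocessing (hypothetical helper, not a library API).

    pixels: H x W x 3 nested lists of ints in 0..255.
    Returns a dict with a "pixel_values" batch shaped [1][3][H][W].
    """
    height = len(pixels)
    width = len(pixels[0])
    channels = []
    for c in range(3):
        # Scale to 0..1, then normalize with the assumed mean and std.
        plane = [
            [(pixels[y][x][c] / 255.0 - mean) / std for x in range(width)]
            for y in range(height)
        ]
        channels.append(plane)
    return {"pixel_values": [channels]}  # leading batch dimension of size 1


# A 2x2 image: black, white, red, green.
image = [
    [[0, 0, 0], [255, 255, 255]],
    [[255, 0, 0], [0, 255, 0]],
]
inputs = preprocess_image(image)
```

The result is a dict rather than a bare array, which is why later steps index it with inputs['pixel_values'] instead of passing the whole object.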
3. Fill in the blank (hard)

Fix the error in the code to retrieve documents using the multimodal retriever.

retrieved_docs = retriever.[1](inputs['pixel_values'], top_k=5)
A. get_relevant_documents
B. retrieve_docs
C. retrieve_documents
D. retrieve
Common Mistakes
Using retrieve, which is not the method this exercise expects.
Using retrieve_docs, which does not exist.
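To show what a top_k retrieval call does conceptually, here is a minimal sketch that ranks documents by cosine similarity to a query embedding. The three-dimensional vectors, the document names, and the helper retrieve_top_k are all made up for illustration; a real multimodal retriever would compute the query vector from the image's pixel_values and search a much larger index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_k(query_vec, index, top_k=5):
    """Return the ids of the top_k documents most similar to query_vec.

    index: dict mapping document id -> embedding (hypothetical toy index).
    """
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]


# Toy index with made-up embeddings.
index = {
    "cat_photo_caption": [0.9, 0.1, 0.0],
    "dog_article": [0.1, 0.9, 0.0],
    "car_manual": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]  # stands in for an encoded image
top_docs = retrieve_top_k(query, index, top_k=2)
```

Asking for more documents than the index holds simply returns everything, ranked, so top_k acts as an upper bound rather than a guarantee.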
4. Fill in the blanks (hard)

Fill both blanks to create a multimodal RAG model and generate an answer.

from transformers import RagSequenceForGeneration
model = RagSequenceForGeneration.from_pretrained('multimodal-rag-base')
outputs = model.generate([1], [2]=retrieved_docs)
A. input_ids
B. context_input_ids
C. retrieved_docs
D. context_docs
Common Mistakes
Passing retrieved_docs as a positional argument instead of through the context_input_ids keyword.
Using the wrong keyword argument names.
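The distinction this task draws between input_ids (the question) and context_input_ids (the retrieved material) can be sketched with a toy tokenizer. The example below assumes that each context sequence is a retrieved document followed by a separator and the question; the whitespace tokenizer, the "//" separator, and the helper names are hypothetical choices for illustration, not the behavior of any particular library.

```python
def build_vocab(texts):
    """Assign an integer id to every distinct whitespace token (toy tokenizer)."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(text, vocab):
    """Map a string to a list of token ids using the toy vocabulary."""
    return [vocab[token] for token in text.lower().split()]


def build_context_input_ids(question, docs, vocab, sep="//"):
    # One sequence per retrieved document: document text, separator, question.
    return [encode(f"{doc} {sep} {question}", vocab) for doc in docs]


question = "what is shown in the image?"
docs = ["a cat sitting on a sofa", "a dog in the park"]
vocab = build_vocab(docs + [question, "//"])

input_ids = encode(question, vocab)  # the question alone
context_input_ids = build_context_input_ids(question, docs, vocab)  # one row per document
```

Here input_ids carries only the question, while context_input_ids carries one sequence per retrieved document, which is why the two are passed to the generator under different names.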
5. Fill in the blanks (hard)

Fill all three blanks to prepare inputs, retrieve documents, and generate answers in a multimodal RAG pipeline.

question = 'What is shown in the image?'
inputs = retriever.question_encoder.tokenizer(question, return_tensors='pt')
image_inputs = retriever.image_encoder.[1](image, return_tensors='pt')
retrieved_docs = retriever.get_relevant_documents([2])
outputs = model.generate(input_ids=inputs['input_ids'], [3]=retrieved_docs)
A. encode_image
B. image_inputs['pixel_values']
C. context_input_ids
D. input_ids
Common Mistakes
Passing the whole image_inputs dict instead of image_inputs['pixel_values'].
Using input_ids instead of context_input_ids for the retrieved documents.