RAG architecture overview in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Experiment - RAG architecture overview

Problem: You want to build a model that can answer questions by combining a large knowledge base with a language model. The current model uses a simple language model without external knowledge retrieval.

Current Metrics: Training accuracy: 90%, Validation accuracy: 65%, Loss: 0.8

Issue: The model struggles to answer questions requiring specific facts not seen during training, leading to low validation accuracy and poor generalization.

Your Task

Improve the model by implementing a Retrieval-Augmented Generation (RAG) architecture to increase validation accuracy to above 80%.

You must keep the base language model unchanged.

You can add a retrieval component and combine it with the language model.

Training time should not increase by more than 2x.
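To check the 80% target, you need a way to score answers on the validation set. The sketch below assumes accuracy means exact match after light normalization (lowercasing, dropping punctuation and articles). The experiment does not state how accuracy is computed, so treat this as one plausible choice. The example predictions and references are hypothetical.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


# Hypothetical validation examples, for illustration only.
preds = ["Jane Austen", "the Pacific Ocean", "1969", "Paris.", "Mars"]
golds = ["jane austen", "Pacific Ocean", "1969", "Paris", "Venus"]

accuracy = exact_match_accuracy(preds, golds)
print(f"Validation accuracy: {accuracy:.0%}")
print("Above 80% target:", accuracy > 0.80)
```

Note that the task asks for accuracy above 80%, so a score of exactly 80% would not meet it.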

Solution

# Requires transformers and torch; the retriever additionally needs the
# datasets and faiss packages to be installed.
from transformers import RagTokenizer, RagRetriever, RagSequenceForGeneration

# Load the tokenizer and the retriever.
# use_dummy_dataset=True loads a small dummy index instead of the full one,
# so this is only a quick check that the pipeline runs; answers may be wrong.
tokenizer = RagTokenizer.from_pretrained('facebook/rag-sequence-nq')
retriever = RagRetriever.from_pretrained('facebook/rag-sequence-nq', index_name='exact', use_dummy_dataset=True)

# Load the pretrained RAG model and attach the retriever
model = RagSequenceForGeneration.from_pretrained('facebook/rag-sequence-nq', retriever=retriever)

# Example input question
question = "Who wrote the book 'Pride and Prejudice'?"
inputs = tokenizer(question, return_tensors='pt')

# Retrieve supporting passages and generate an answer conditioned on them
outputs = model.generate(input_ids=inputs['input_ids'], attention_mask=inputs['attention_mask'])
answer = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

print(f"Question: {question}")
print(f"Answer: {answer}")

Added a retriever component to fetch relevant documents from a knowledge base.

Used a RAG model that combines retrieved documents with the language model for generation.

Kept the base language model unchanged but enhanced input with retrieved context.
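The three changes above can be sketched without any model weights. This is a minimal illustration of the retrieve-then-generate flow, not how the facebook/rag-sequence-nq model works internally: it assumes a bag-of-words cosine-similarity retriever in place of a dense retriever, a hypothetical three-passage knowledge base, and a stub generator standing in for the unchanged language model.

```python
import math
import re
from collections import Counter
from typing import Callable

# Toy knowledge base (hypothetical passages, for illustration only).
KNOWLEDGE_BASE = [
    "Pride and Prejudice is a novel written by Jane Austen and published in 1813.",
    "Moby-Dick is a novel by Herman Melville about the whale hunter Captain Ahab.",
    "The Eiffel Tower is a wrought-iron tower in Paris, France.",
]


def tokenize(text: str) -> list[str]:
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, passages: list[str], k: int = 1) -> list[str]:
    """Retriever: return the k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(passages, key=lambda p: cosine_similarity(q, Counter(tokenize(p))), reverse=True)
    return ranked[:k]


def build_prompt(question: str, passages: list[str]) -> str:
    """Combine the retrieved passages with the question into one input."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def answer_with_rag(question: str, passages: list[str],
                    generate: Callable[[str], str], k: int = 1) -> str:
    """Retrieve context, then pass the augmented prompt to the unchanged generator."""
    return generate(build_prompt(question, retrieve(question, passages, k)))


def stub_generator(prompt: str) -> str:
    # Stand-in for the base language model: it only echoes the prompt it receives.
    return prompt


question = "Who wrote the book 'Pride and Prejudice'?"
print(answer_with_rag(question, KNOWLEDGE_BASE, stub_generator))
```

The generator is passed in as a callable, so the base model stays untouched: only its input changes.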

Results Interpretation

Before: Training accuracy 90%, Validation accuracy 65%, Loss 0.8

After: Training accuracy 88%, Validation accuracy 83%, Loss 0.5

Adding a retrieval step to provide relevant external knowledge helps the model answer questions better, reducing overfitting and improving validation accuracy.
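One simple way to read the overfitting claim is to look at the gap between training and validation accuracy before and after the change. The sketch below uses the figures reported above. The gap is a rough indicator, not a formal overfitting measure.

```python
def generalization_gap(train_acc: int, val_acc: int) -> int:
    """Gap between training and validation accuracy, in percentage points."""
    return train_acc - val_acc


# Accuracy figures (in percent) taken from the results above.
before = {"train": 90, "val": 65}
after = {"train": 88, "val": 83}

gap_before = generalization_gap(before["train"], before["val"])
gap_after = generalization_gap(after["train"], after["val"])

print(f"Gap before RAG: {gap_before} points")
print(f"Gap after RAG:  {gap_after} points")
```

The gap shrinks from 25 to 5 percentage points, while training accuracy drops by only 2 points.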

Bonus Experiment

Try fine-tuning the RAG model on your own question-answer dataset to further improve accuracy.

💡 Hint

Use a small dataset and train for a few epochs with a low learning rate to avoid overfitting.
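One common way to keep a short fine-tuning run from overfitting is early stopping on validation loss. The hint does not mention this technique, so the sketch below is an added assumption. The `best_epoch` helper, the patience value, and the loss values are all hypothetical.

```python
def best_epoch(val_losses: list[float], patience: int = 1) -> int:
    """Return the 0-based epoch with the lowest validation loss seen before
    the loss fails to improve for `patience` consecutive epochs."""
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    best_idx = 0
    epochs_without_improvement = 0
    for i in range(1, len(val_losses)):
        if val_losses[i] < val_losses[best_idx]:
            best_idx = i
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop training; later epochs are never run
    return best_idx


# Hypothetical per-epoch validation losses from a short fine-tuning run.
losses = [0.50, 0.46, 0.44, 0.47, 0.52]
keep = best_epoch(losses, patience=1)
print(f"Keep the checkpoint from epoch {keep + 1} (validation loss {losses[keep]})")
```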

Practice

1. What is the main purpose of the retriever component in a RAG architecture?

easy

A. To find relevant documents or information from a large dataset

B. To generate natural language answers from scratch

C. To train the model on labeled data

D. To evaluate the accuracy of the answers
