
RAG architecture overview in Prompt Engineering / GenAI - Model Pipeline Trace

Model Pipeline - RAG architecture overview

The RAG (Retrieval-Augmented Generation) architecture combines a search step with a text generation step. It first retrieves relevant documents from a large collection, then passes them to a language model as context so that the generated answer is grounded in those documents.
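The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: the word-overlap scoring is a toy stand-in for a real retriever such as keyword search or embedding similarity, the five-sentence corpus is made up for the example, and the names `tokenize`, `retrieve`, and `build_prompt` are hypothetical.

```python
import re

# Toy corpus (an assumption for illustration; a real system searches a large index).
DOCUMENTS = [
    "Paris is the capital of France.",
    "France is in Europe.",
    "The Eiffel Tower is in Paris.",
    "French culture is rich.",
    "Paris has many museums.",
]


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, documents, top_k=5):
    """Rank documents by how many words they share with the query.

    Word overlap is a simplification; it only stands in for a real retriever.
    """
    query_words = tokenize(query)
    # sorted() is stable, so documents with equal scores keep their original order.
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, documents):
    """Join the query and the retrieved texts into one input string."""
    return f"Question: {query} Context: {' '.join(documents)}"


query = "What is the capital of France?"
top_documents = retrieve(query, DOCUMENTS, top_k=5)
prompt = build_prompt(query, top_documents)
print(prompt)
```

The resulting `prompt` string is what would be handed to a language model in the generation step; that call is left out here because it depends on the model and client library in use.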

Data Flow - 4 Stages
Stage 1: Input Query
Input: 1 query string
Step: User provides a question or prompt
Output: 1 query string
Example: "What is the capital of France?"

Stage 2: Document Retrieval
Input: 1 query string
Step: Searches a large document database for relevant texts
Output: Top 5 documents (5 texts)
Example: ["Paris is the capital of France.", "France is in Europe.", "The Eiffel Tower is in Paris.", "French culture is rich.", "Paris has many museums."]

Stage 3: Context Preparation
Input: 1 query string + 5 documents
Step: Combines query with retrieved documents to form input for generation
Output: 1 combined input string
Example: "Question: What is the capital of France? Context: Paris is the capital of France. France is in Europe. The Eiffel Tower is in Paris. French culture is rich. Paris has many museums."

Stage 4: Text Generation
Input: 1 combined input string
Step: Generates an answer using a language model conditioned on the context
Output: 1 answer string
Example: "The capital of France is Paris."

Training Trace - Epoch by Epoch
Loss: 2.3 |****
      1.8 |******
      1.4 |********
      1.1 |*********
      0.9 |*********

Epoch | Loss ↓ | Accuracy ↑ | Observation
1     | 2.3    | 0.25       | Model starts learning, loss is high, accuracy low
2     | 1.8    | 0.40       | Loss decreases, accuracy improves
3     | 1.4    | 0.55       | Model learns better context usage
4     | 1.1    | 0.65       | Continued improvement in generation quality
5     | 0.9    | 0.72       | Model converges with good retrieval and generation

Prediction Trace - 4 Layers
Layer 1: Input Query
Layer 2: Document Retrieval
Layer 3: Context Preparation
Layer 4: Text Generation
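
One way to picture the four-layer trace above is a small runner that calls each layer in order and records what it produced. This is only a sketch: every `layer_*` function is a hypothetical stub, the retrieval layer returns hard-coded texts instead of searching an index, and the generation layer returns a fixed string instead of calling a language model.

```python
# Hypothetical stand-ins for real components.
# Each layer reads the shared state dict and returns the new fields it produced.

def layer_input(state):
    # Normalize the raw user query.
    return {"query": state["query"].strip()}


def layer_retrieval(state):
    # Assumption: a real system would query a search index or vector store here.
    return {"documents": ["Paris is the capital of France.", "France is in Europe."]}


def layer_context(state):
    # Combine the query and retrieved documents into one prompt string.
    context = " ".join(state["documents"])
    return {"prompt": f"Question: {state['query']} Context: {context}"}


def layer_generation(state):
    # Assumption: a real system would send state["prompt"] to a language model.
    return {"answer": "The capital of France is Paris."}


LAYERS = [
    ("Input Query", layer_input),
    ("Document Retrieval", layer_retrieval),
    ("Context Preparation", layer_context),
    ("Text Generation", layer_generation),
]


def run_with_trace(query):
    """Run all layers in order, recording which fields each one produced."""
    state = {"query": query}
    trace = []
    for number, (name, layer) in enumerate(LAYERS, start=1):
        output = layer(state)
        state.update(output)
        trace.append(f"Layer {number}: {name} -> {sorted(output)}")
    return state, trace


final_state, trace = run_with_trace("  What is the capital of France? ")
for line in trace:
    print(line)
print(final_state["answer"])
```

Keeping each layer as a separate function makes it straightforward to swap a stub for a real retriever or model client while leaving the trace logic unchanged.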
Model Quiz - 3 Questions
Test your understanding
What is the main role of the document retrieval step in RAG?
A. Find relevant documents to help answer the query
B. Generate the final answer text
C. Clean the input query
D. Evaluate model accuracy
Key Insight
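RAG improves answer quality by first finding helpful documents, then using them as context for generation. Because the model can ground its answer in the retrieved text instead of relying only on what it learned during training, this two-step approach tends to produce more accurate and informative responses.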