
Model Pipeline - Why RAG grounds LLMs in real data

RAG (Retrieval-Augmented Generation) helps large language models (LLMs) use real data by first finding relevant information and then generating answers based on that data. This makes the model's responses more accurate and grounded in facts.

Data Flow - 4 Stages

Stage 1: Input Query
  Input: 1 query string -> Output: 1 query string
  The user asks a question or gives a prompt.
  Example: "What is the capital of France?"

Stage 2: Document Retrieval
  Input: 1 query string -> Output: 5 documents (text snippets)
  Search a large database to find the top relevant documents.
  Example: ["Paris is the capital of France.", "France's largest city is Paris.", "Paris is known for the Eiffel Tower.", "The capital city of France is Paris.", "Paris is a major European city."]

Stage 3: Context Construction
  Input: 1 query string + 5 documents -> Output: 1 combined text input
  Combine the query with the retrieved documents to form the context.
  Example: "Question: What is the capital of France? Context: Paris is the capital of France. France's largest city is Paris."

Stage 4: LLM Generation
  Input: 1 combined text input -> Output: 1 answer string
  Generate an answer grounded in the query and the retrieved context.
  Example: "The capital of France is Paris."
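The four stages above can be sketched end to end in Python. This is a toy illustration, not a real system: word-overlap scoring stands in for a vector index, and a stubbed function stands in for the actual LLM call.

```python
# Minimal sketch of the 4-stage RAG data flow.
# Assumptions: a small in-memory corpus, word-overlap retrieval scoring,
# and a hard-coded stand-in for the LLM generation step.

def tokens(text):
    """Lowercased word set with trailing punctuation stripped."""
    return {w.strip('.?,!').lower() for w in text.split()}

def retrieve(query, corpus, k=5):
    """Stage 2: rank documents by word overlap with the query."""
    q = tokens(query)
    ranked = sorted(corpus, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]

def build_context(query, docs):
    """Stage 3: combine the query and retrieved documents into one input."""
    return f"Question: {query} Context: {' '.join(docs)}"

def generate(context):
    """Stage 4: placeholder for the LLM call (a real system would invoke a model here)."""
    return "The capital of France is Paris."

corpus = [
    "Paris is the capital of France.",
    "France's largest city is Paris.",
    "Paris is known for the Eiffel Tower.",
    "The capital city of France is Paris.",
    "Paris is a major European city.",
]

query = "What is the capital of France?"   # Stage 1: input query
docs = retrieve(query, corpus)             # Stage 2
context = build_context(query, docs)       # Stage 3
answer = generate(context)                 # Stage 4
print(answer)
```

The key design point is that the generator never sees the raw database, only the small context assembled in stage 3, which is what keeps the answer tied to retrieved facts.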
Training Trace - Epoch by Epoch

Loss
1.2 |****
1.0 |***
0.8 |**
0.6 |**
0.4 |*
    +------------
     1 2 3 4 5 Epochs
Epoch | Loss ↓ | Accuracy ↑ | Observation
------|--------|------------|------------------------------------------------------------
1     | 1.2    | 0.45       | Model starts learning to combine retrieval and generation.
2     | 0.9    | 0.60       | Loss decreases as the model improves grounding in retrieved data.
3     | 0.7    | 0.72       | Model better integrates retrieved documents for accurate answers.
4     | 0.5    | 0.80       | Training converges with improved factual accuracy.
5     | 0.4    | 0.85       | Final epoch shows strong grounding and generation quality.
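A trace like this can be checked programmatically for convergence. The sketch below uses the illustrative numbers from the table, not measurements from a real training run:

```python
# Sanity-check the training trace: loss should fall and accuracy should
# rise monotonically across epochs. Values are the illustrative numbers
# from the table above.

trace = [
    # (epoch, loss, accuracy)
    (1, 1.2, 0.45),
    (2, 0.9, 0.60),
    (3, 0.7, 0.72),
    (4, 0.5, 0.80),
    (5, 0.4, 0.85),
]

losses = [loss for _, loss, _ in trace]
accuracies = [acc for _, _, acc in trace]

loss_falling = all(a > b for a, b in zip(losses, losses[1:]))
acc_rising = all(a < b for a, b in zip(accuracies, accuracies[1:]))
converged = loss_falling and acc_rising
print("converged:", converged)
```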
Prediction Trace - 4 Layers
Layer 1: Input Query
Layer 2: Document Retrieval
Layer 3: Context Construction
Layer 4: LLM Generation
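The four layers above can be instrumented so each intermediate output is recorded, which is how a prediction trace is typically produced. A minimal sketch, with toy stand-ins for retrieval and generation:

```python
# Layer-by-layer prediction trace: run each RAG stage and record
# (layer name, output) so intermediate results can be inspected.
# The retrieval filter and the LLM output are toy stand-ins.

def run_with_trace(query, corpus):
    trace = []
    trace.append(("Input Query", query))                        # Layer 1
    docs = [d for d in corpus if "capital" in d.lower()]        # Layer 2 (toy keyword filter)
    trace.append(("Document Retrieval", docs))
    context = f"Question: {query} Context: {' '.join(docs)}"    # Layer 3
    trace.append(("Context Construction", context))
    answer = "The capital of France is Paris."                  # Layer 4 (LLM stub)
    trace.append(("LLM Generation", answer))
    return trace

corpus = ["Paris is the capital of France.", "Paris is a major European city."]
trace = run_with_trace("What is the capital of France?", corpus)
for layer, output in trace:
    print(layer, "->", output)
```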
Model Quiz - 3 Questions
Test your understanding
Q1. What is the main role of the document retrieval step in RAG?
  A. Clean the input query
  B. Generate the final answer directly
  C. Find relevant real data to support the answer
  D. Train the language model
Key Insight
RAG improves large language models by grounding their answers in real, retrieved data. This reduces guesswork and increases factual accuracy by combining search and generation steps.