
How RAG Works with Prompting: Simple Explanation and Example

Retrieval-Augmented Generation (RAG) first retrieves documents relevant to the user's query from a knowledge source, then inserts them into a prompt together with instructions. The language model answers from that retrieved text instead of relying only on what it learned during training, which makes its responses more accurate and context-aware.

📐

Syntax

The basic pattern of RAG with prompting involves three parts:

Query: The user's question or input.
Retriever: A system that finds relevant documents or data based on the query.
Prompt: A text template that combines the retrieved documents with instructions to guide the language model's answer.

The language model then generates the output based on this combined prompt.

python

def rag_prompt(query, retrieved_docs):
    # retrieved_docs is the retrieved text as a single string; query is the user's question.
    prompt = f"Use the following information to answer the question:\n{retrieved_docs}\nQuestion: {query}\nAnswer:"
    return prompt
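
When the retriever returns several passages rather than a single string, they need to be joined into text before they go into the template. The following is a minimal sketch of one way to do that. The helper name `format_docs` and the `[1]`, `[2]` numbering are assumptions made for illustration, not part of the original example.

```python
def format_docs(docs):
    """Join several retrieved passages into one numbered context block."""
    return "\n".join(f"[{i}] {doc}" for i, doc in enumerate(docs, start=1))


def rag_prompt(query, retrieved_docs):
    # Same template as above; retrieved_docs is expected to be a string.
    return (
        "Use the following information to answer the question:\n"
        f"{retrieved_docs}\nQuestion: {query}\nAnswer:"
    )


# Two illustrative passages (hypothetical data).
docs = [
    "RAG means Retrieval-Augmented Generation.",
    "A retriever finds documents related to the query.",
]
prompt = rag_prompt("What is RAG?", format_docs(docs))
print(prompt)
```

Numbering the passages keeps them visually separate in the prompt, so the model is less likely to read two passages as one.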

💻

Example

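This example walks through the full RAG flow: retrieve, build the prompt, generate. A dictionary lookup stands in for the retriever and a stub function stands in for the language model, so the script runs without any external service.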

python

def simple_retriever(query):
    # Toy retriever: an exact-match dictionary lookup stands in for a real search step.
    knowledge_base = {
        "What is AI?": "AI stands for Artificial Intelligence, which means machines that can learn and think.",
        "What is RAG?": "RAG means Retrieval-Augmented Generation, combining search and language models.",
    }
    # Queries that are not an exact key fall back to a placeholder string.
    return knowledge_base.get(query, "No relevant information found.")


def rag_prompt(query, retrieved_docs):
    prompt = f"Use the following information to answer the question:\n{retrieved_docs}\nQuestion: {query}\nAnswer:"  
    return prompt


def language_model_generate(prompt):
    # Simulated language model: a real system would send the prompt to an LLM here.
    if "No relevant information" in prompt:
        return "Sorry, I don't have enough information to answer."
    # Recover the question text from between "Question: " and "\nAnswer:".
    question = prompt.split("Question: ")[1].split("\nAnswer:")[0]
    return "Based on the information, " + question + " is explained as above."


# User query
query = "What is RAG?"

# Step 1: Retrieve relevant documents
retrieved_docs = simple_retriever(query)

# Step 2: Create prompt
prompt = rag_prompt(query, retrieved_docs)

# Step 3: Generate answer
answer = language_model_generate(prompt)

print(answer)

Output

Based on the information, What is RAG? is explained as above.
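
The dictionary lookup above only matches a query that is spelled exactly like one of its keys. As a rough sketch of a slightly more flexible retriever, the code below scores each passage by how many words it shares with the query. The names `DOCUMENTS`, `tokenize`, and `overlap_retriever` are hypothetical. Word overlap is a simplification chosen for illustration; production systems typically use a search index or embedding similarity instead.

```python
import re

# Hypothetical mini knowledge base, reusing the two passages from the example.
DOCUMENTS = [
    "AI stands for Artificial Intelligence, which means machines that can learn and think.",
    "RAG means Retrieval-Augmented Generation, combining search and language models.",
]


def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_retriever(query, documents=DOCUMENTS, top_k=1):
    """Return up to top_k documents that share the most words with the query."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    # Keep only documents with at least one shared word, best score first.
    matches = sorted(
        (pair for pair in scored if pair[0] > 0),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [doc for _, doc in matches[:top_k]]


print(overlap_retriever("Tell me about RAG and language models"))
print(overlap_retriever("zebra"))  # no shared words, so the list is empty
```

Returning an empty list when nothing matches, rather than a placeholder sentence, lets the calling code detect a failed retrieval without inspecting the text.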

⚠️

Common Pitfalls

Common mistakes when using RAG with prompting include:

Not retrieving relevant documents: If the retriever returns unrelated or empty data, the model cannot generate a good answer.
Poor prompt design: If the prompt does not clearly instruct the model how to use the retrieved data, the output may be vague or incorrect.
Ignoring retrieval errors: Not handling cases when no documents are found can cause confusing answers.

Always check retrieval quality, and design prompts that clearly connect the retrieved information to the question.

python

def rag_prompt_wrong(query, retrieved_docs):
    # No instructions, and nothing separates the retrieved text from the question
    prompt = f"{retrieved_docs} {query}"
    return prompt

# Correct way

def rag_prompt_right(query, retrieved_docs):
    prompt = f"Use the following information to answer the question:\n{retrieved_docs}\nQuestion: {query}\nAnswer:"  
    return prompt
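
The third pitfall, ignoring retrieval errors, can be handled before the model is ever called. The sketch below assumes the retriever returns a list of strings (possibly empty). `rag_prompt_safe`, `answer`, `FALLBACK`, and the `generate` callable are hypothetical names introduced for illustration.

```python
FALLBACK = "Sorry, I don't have enough information to answer."


def rag_prompt_safe(query, retrieved_docs):
    """Build a prompt from a list of passages, or return None when there are none."""
    passages = [doc.strip() for doc in retrieved_docs if doc and doc.strip()]
    if not passages:
        return None  # Signal the caller to skip the model call.
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Use only the following information to answer the question. "
        "If the information is not sufficient, say you do not know.\n"
        f"{context}\nQuestion: {query}\nAnswer:"
    )


def answer(query, retrieved_docs, generate):
    """Return a fixed fallback when retrieval is empty; otherwise call the model."""
    prompt = rag_prompt_safe(query, retrieved_docs)
    if prompt is None:
        return FALLBACK
    return generate(prompt)


# generate is any callable that maps a prompt string to an answer string.
print(answer("What is RAG?", [], generate=lambda p: "unused"))
```

Checking for empty retrieval in code, instead of hoping the model notices a "nothing found" message inside the prompt, makes the behaviour for that case predictable.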

📊

Quick Reference

Step	Description
Query	User asks a question or inputs text.
Retrieve	Find relevant documents or data related to the query.
Prompt	Combine retrieved data with instructions into a prompt.
Generate	Language model uses the prompt to create an answer.
Output	Return the generated answer to the user.
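
The steps in the table can be wired into a single function that keeps every intermediate value, which makes it easier to check retrieval quality and inspect the exact prompt the model received. This is only a sketch: `run_rag` and the three `toy_*` stand-ins are hypothetical and exist so the code runs without a model or database.

```python
def run_rag(query, retrieve, build_prompt, generate):
    """Run the steps in order and keep every intermediate value for inspection."""
    docs = retrieve(query)               # Retrieve
    prompt = build_prompt(query, docs)   # Prompt
    answer = generate(prompt)            # Generate
    return {"query": query, "docs": docs, "prompt": prompt, "answer": answer}


# Toy stand-ins (hypothetical) for the retriever, prompt template, and model.
def toy_retrieve(query):
    return "RAG means Retrieval-Augmented Generation, combining search and language models."


def toy_prompt(query, docs):
    return f"Use the following information to answer the question:\n{docs}\nQuestion: {query}\nAnswer:"


def toy_generate(prompt):
    return f"(model output for a prompt of {len(prompt)} characters)"


trace = run_rag("What is RAG?", toy_retrieve, toy_prompt, toy_generate)
for step, value in trace.items():
    print(f"{step}: {value!r}")
```

Passing the three steps in as functions means any one of them can be swapped, for example a different retriever, without touching the others.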

✅

Key Takeaways

RAG combines document retrieval with prompting to improve language model answers.

The prompt must clearly include retrieved information and instructions for best results.

Poor retrieval or unclear prompts lead to weak or incorrect answers.

Always handle cases where no relevant documents are found.

RAG helps models answer questions using up-to-date or specific external knowledge.