
Combining retrieved context with LLM in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Experiment - Combining retrieved context with LLM
Problem: You want to improve a language model's answers by adding relevant information retrieved from a document database before generating the response.
Current Metrics: Model answers are generic and sometimes miss key facts. No evaluation metric yet.
Issue: The language model does not use external context, so answers lack specific details and accuracy.
Your Task
Combine retrieved context with the language model input to improve answer relevance and accuracy. Measure improvement by comparing answer quality on a test set.
Use a simple retrieval method (e.g., keyword search or embedding similarity).
Combine retrieved text with the prompt before passing to the LLM.
Do not change the LLM architecture or training.
Solution
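import string

from openai import OpenAI

# Create the API client; it reads the key from the OPENAI_API_KEY environment variable
client = OpenAI()

# Sample documents
documents = [
    "The Eiffel Tower is located in Paris.",
    "The Great Wall of China is in northern China.",
    "Python is a popular programming language.",
    "The Amazon rainforest is the largest tropical rainforest."
]

# Lowercase the text, strip punctuation, and return its set of words
def tokenize(text):
    cleaned = text.lower().translate(str.maketrans('', '', string.punctuation))
    return set(cleaned.split())

# Simple keyword-based retrieval function
def retrieve_context(question, docs, top_k=1):
    question_words = tokenize(question)
    scored_docs = []
    for doc in docs:
        # Score each document by the number of words it shares with the question
        score = len(question_words.intersection(tokenize(doc)))
        scored_docs.append((score, doc))
    # Sort by score only, highest first
    scored_docs.sort(key=lambda x: x[0], reverse=True)
    top_docs = [doc for score, doc in scored_docs if score > 0][:top_k]
    return ' '.join(top_docs) if top_docs else ''

# Function to generate answer with context
def generate_answer(question):
    context = retrieve_context(question, documents)
    prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
    # The legacy openai.Completion API and text-davinci-003 have been retired.
    # gpt-3.5-turbo-instruct is a completions model; swap in any model you can access.
    response = client.completions.create(
        model="gpt-3.5-turbo-instruct",
        prompt=prompt,
        max_tokens=50,
        temperature=0
    )
    return response.choices[0].text.strip()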

# Example usage
question = "Where is the Eiffel Tower located?"
answer = generate_answer(question)
print(f"Question: {question}")
print(f"Answer: {answer}")
Added a simple retrieval function to find relevant documents based on question keywords.
Combined retrieved context with the question in the prompt sent to the LLM.
Kept the LLM model and parameters unchanged.
Fixed sorting in retrieval function to specify key for sorting.
Results Interpretation

Before: Answers were generic and sometimes incorrect because the model had no extra information.

After: Answers include relevant facts from retrieved documents, making them more accurate and helpful.

Adding relevant external context to the language model input helps it generate better, more accurate answers without changing the model itself.
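The task asks for a measured comparison on a test set, so a minimal sketch of one possible metric follows. It checks whether each answer contains an expected keyword. The test set, the function name keyword_accuracy, and the two stub answer functions are assumptions made for illustration; the stubs only let the sketch run without an API key and say nothing about how a real model performs.

```python
from typing import Callable, Dict, List


def keyword_accuracy(answer_fn: Callable[[str], str],
                     test_set: List[Dict[str, str]]) -> float:
    """Return the fraction of questions whose answer contains the expected keyword."""
    if not test_set:
        return 0.0
    hits = sum(
        1 for item in test_set
        if item["expected"].lower() in answer_fn(item["question"]).lower()
    )
    return hits / len(test_set)


# Hypothetical test set: each question is paired with a fact the answer should mention.
test_set = [
    {"question": "Where is the Eiffel Tower located?", "expected": "paris"},
    {"question": "Which rainforest is the largest tropical rainforest?", "expected": "amazon"},
]


# Placeholder answer functions (not real model output).
# In practice, pass a no-context LLM call and the context-augmented generate_answer.
def stub_without_context(question: str) -> str:
    return "I am not sure."


def stub_with_context(question: str) -> str:
    if "Eiffel" in question:
        return "The Eiffel Tower is located in Paris."
    return "The Amazon rainforest is the largest tropical rainforest."


before = keyword_accuracy(stub_without_context, test_set)
after = keyword_accuracy(stub_with_context, test_set)
print(f"Keyword accuracy without context: {before:.2f}")
print(f"Keyword accuracy with context: {after:.2f}")
```

Keyword matching is a coarse metric: it rewards any answer that mentions the keyword, so it is best treated as a first check before a stricter evaluation.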
Bonus Experiment
Try using embedding-based similarity search instead of keyword matching for retrieval to improve context relevance.
💡 Hint
Use sentence embeddings from a library like SentenceTransformers to find documents closest in meaning to the question.
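The ranking step of similarity search can be sketched without any external library. In the sketch below, toy_embed is a placeholder that builds word-count vectors, so it does not capture meaning the way real sentence embeddings do; it is an assumption made so the example runs on its own. With a sentence-embedding model, the vectors would come from the model instead, and the cosine ranking would stay the same.

```python
import math
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase words with surrounding punctuation removed."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Assign each distinct word an index in the vector."""
    vocab: Dict[str, int] = {}
    for text in texts:
        for word in tokenize(text):
            vocab.setdefault(word, len(vocab))
    return vocab


def toy_embed(text: str, vocab: Dict[str, int]) -> List[float]:
    """Placeholder embedding (assumption): a word-count vector, not a learned embedding."""
    vec = [0.0] * len(vocab)
    for word in tokenize(text):
        if word in vocab:
            vec[vocab[word]] += 1.0
    return vec


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_by_similarity(question: str, docs: List[str], top_k: int = 1) -> List[str]:
    """Rank documents by cosine similarity to the question and keep the top_k matches."""
    vocab = build_vocab(docs + [question])
    q_vec = toy_embed(question, vocab)
    scored = [(cosine_similarity(q_vec, toy_embed(d, vocab)), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


docs = [
    "The Eiffel Tower is located in Paris.",
    "Python is a popular programming language.",
    "The Amazon rainforest is the largest tropical rainforest.",
]
print(retrieve_by_similarity("Which language is popular for programming?", docs))
```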

Practice

1. Why do we combine retrieved context with a large language model (LLM)?
easy
A. To give the model extra information it did not learn before
B. To make the model run faster
C. To reduce the size of the model
D. To replace the model's training data

Solution

  1. Step 1: Understand the purpose of retrieved context

    Retrieved context provides additional information that the model might not have seen during training.
  2. Step 2: Connect context to model output quality

    Providing this extra information helps the model give better and more accurate answers.
  3. Final Answer:

    To give the model extra information it did not learn before -> Option A
  4. Quick Check:

    Extra info improves answers = A [OK]
Hint: Extra info helps model answer better [OK]
Common Mistakes:
  • Thinking context speeds up the model
  • Believing context shrinks the model size
  • Assuming context replaces training data
2. Which of the following is the correct way to combine retrieved context with an LLM prompt?
easy
A. prompt = question * context
B. prompt = question + context
C. prompt = context + ' ' + question
D. prompt = context - question

Solution

  1. Step 1: Understand prompt construction

    The prompt should start with the context followed by the question to give the model relevant info first.
  2. Step 2: Check syntax correctness

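    Concatenating strings with '+' is valid. Options A and D raise a TypeError in Python because one string cannot be multiplied by or subtracted from another. Option B is valid syntax but puts the question first and omits the separating space.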
  3. Final Answer:

    prompt = context + ' ' + question -> Option C
  4. Quick Check:

    Context before question with '+' = C [OK]
Hint: Concatenate context and question with + [OK]
Common Mistakes:
  • Putting question before context
  • Using * or - operators on strings
  • Not adding space between context and question
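Plain concatenation works, but labeling each part makes the prompt easier for the model (and for a reader) to parse. The following is a minimal sketch of such a prompt builder; the function name build_prompt and the instruction wording are assumptions for illustration, while the Context/Question/Answer labels follow the solution code.

```python
def build_prompt(context: str, question: str) -> str:
    """Place the retrieved context first, then the question, each with a label."""
    if not context.strip():
        # No context was retrieved: fall back to the bare question.
        return f"Question: {question}\nAnswer:"
    return (
        "Answer the question using the context below.\n"
        f"Context: {context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_prompt(
    "The capital of France is Paris.",
    "What is the capital of France?",
)
print(prompt)
```

Handling the empty-context case explicitly avoids sending a dangling "Context:" label when retrieval finds nothing.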
3. Given the code below, what will be the output?
context = 'The capital of France is Paris.'
question = 'What is the capital of France?'
prompt = context + ' ' + question
response = llm.generate(prompt)
print(response)
Assuming llm.generate() returns the model's answer, what is the likely output?
medium
A. Paris
B. London
C. I don't know
D. Error: undefined variable

Solution

  1. Step 1: Analyze the prompt content

    The prompt includes the context 'The capital of France is Paris.' followed by the question.
  2. Step 2: Predict model output based on context

    The model uses the context to answer correctly with 'Paris'.
  3. Final Answer:

    Paris -> Option A
  4. Quick Check:

    Context guides answer = Paris [OK]
Hint: Context gives correct answer to question [OK]
Common Mistakes:
  • Ignoring context and guessing wrong
  • Assuming code error without cause
  • Thinking model says 'I don't know'
4. You wrote this code to combine context with a question:
context = 'Water boils at 100 degrees Celsius.'
question = 'At what temperature does water boil?'
prompt = question + ' ' + context
response = llm.generate(prompt)
print(response)
Why might the model give a less accurate answer?
medium
A. Because the context is missing important info
B. Because the question comes before the context, confusing the model
C. Because the model cannot handle string concatenation
D. Because the prompt is too short

Solution

  1. Step 1: Check prompt order

    The prompt puts the question before the context, which may confuse the model about what info to use.
  2. Step 2: Understand best practice

    Context should come first to provide relevant info before the question.
  3. Final Answer:

    Because the question comes before the context, confusing the model -> Option B
  4. Quick Check:

    Context before question improves accuracy = B [OK]
Hint: Put context before question in prompt [OK]
Common Mistakes:
  • Thinking model can't concatenate strings
  • Assuming context lacks info
  • Believing prompt length is the issue
5. You want to build a system that answers questions about a company's products using an LLM. You have a large product manual. What is the best way to combine the manual with the LLM to get accurate answers?
hard
A. Train a new LLM from scratch on the manual
B. Feed the entire manual as a prompt to the LLM every time
C. Only ask the question without any manual context
D. Retrieve relevant sections from the manual and add them as context before the question in the prompt

Solution

  1. Step 1: Consider prompt size limits

    A large manual may exceed the model's context window, and sending all of it with every question is slow and costly.
  2. Step 2: Use retrieval to select relevant info

    Retrieving relevant sections and adding them as context helps the model answer accurately without overload.
  3. Step 3: Evaluate other options

    Asking without context misses info; training new LLM is costly and unnecessary.
  4. Final Answer:

    Retrieve relevant sections from the manual and add them as context before the question in the prompt -> Option D
  5. Quick Check:

    Relevant context retrieval + LLM = D [OK]
Hint: Retrieve relevant info, then prompt LLM [OK]
Common Mistakes:
  • Trying to input entire manual at once
  • Ignoring context and asking only question
  • Thinking retraining is always needed
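The approach in Option D can be sketched as three steps: split the manual into sections, rank the sections against the question, and keep only the best ones that fit a size budget. The sketch below assumes sections are separated by blank lines and uses keyword overlap for ranking. The manual text, the product name X100, and the names split_into_sections, overlap_score, and select_sections are all hypothetical.

```python
from typing import List, Set


def words_of(text: str) -> Set[str]:
    """Lowercase words with surrounding punctuation removed."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in words if w}


def split_into_sections(manual: str) -> List[str]:
    """Treat each blank-line-separated paragraph as one section (assumption)."""
    return [s.strip() for s in manual.split("\n\n") if s.strip()]


def overlap_score(question: str, section: str) -> int:
    """Number of distinct words shared by the question and the section."""
    return len(words_of(question) & words_of(section))


def select_sections(question: str, sections: List[str],
                    top_k: int = 2, max_chars: int = 500) -> List[str]:
    """Keep up to top_k best-matching sections that fit within max_chars in total."""
    ranked = sorted(sections, key=lambda s: overlap_score(question, s), reverse=True)
    selected: List[str] = []
    used = 0
    for section in ranked[:top_k]:
        if overlap_score(question, section) == 0:
            continue  # nothing in common with the question
        if used + len(section) > max_chars:
            continue  # would exceed the size budget
        selected.append(section)
        used += len(section)
    return selected


# Hypothetical manual text for illustration only.
manual = (
    "The X100 blender has a 2-year warranty.\n\n"
    "To clean the X100, rinse the jar with warm water.\n\n"
    "The X100 motor runs at 1200 watts."
)
sections = split_into_sections(manual)
context = " ".join(select_sections("How long is the warranty?", sections, top_k=1))
print(context)
```

The selected text would then be placed before the question in the prompt, as in the solution code. A character budget is a rough proxy; a production system would more likely count tokens.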