Prompt Engineering / GenAI (~20 mins)

Combining retrieved context with an LLM (Prompt Engineering / GenAI, ML Experiment: Train & Evaluate)

Experiment: Combining retrieved context with an LLM
Problem: You want to improve a language model's answers by adding relevant information retrieved from a document database before generating the response.
Current Metrics: Model answers are generic and sometimes miss key facts. No evaluation metric exists yet.
Issue: The language model does not use external context, so answers lack specific details and accuracy.
Your Task
Combine retrieved context with the language model input to improve answer relevance and accuracy. Measure improvement by comparing answer quality on a test set.
Use a simple retrieval method (e.g., keyword search or embedding similarity).
Combine retrieved text with the prompt before passing to the LLM.
Do not change the LLM architecture or training.
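The task also calls for measuring improvement on a test set. One minimal way to do that is to score each answer by how many expected keywords it contains, then compare a baseline run against a context-augmented run. The test questions, expected keywords, and the two stand-in answer functions below are illustrative assumptions, not part of the exercise:

```python
import re

# Illustrative test set: each item pairs a question with keywords
# a good answer should contain (placeholders, not an official benchmark).
test_set = [
    {"question": "Where is the Eiffel Tower located?", "keywords": {"paris"}},
    {"question": "What is the largest tropical rainforest?", "keywords": {"amazon"}},
]

def keyword_score(answer, keywords):
    """Fraction of expected keywords that appear in the answer."""
    words = set(re.findall(r"[a-z]+", answer.lower()))
    return len(keywords & words) / len(keywords)

def evaluate(answer_fn, test_set):
    """Average keyword score of answer_fn over the test set."""
    scores = [keyword_score(answer_fn(item["question"]), item["keywords"])
              for item in test_set]
    return sum(scores) / len(scores)

# Stand-ins for a baseline LLM and a context-augmented one:
baseline = lambda q: "It is a famous landmark."
augmented = lambda q: ("The Eiffel Tower is in Paris."
                       if "eiffel" in q.lower()
                       else "The Amazon rainforest is the largest tropical rainforest.")

print(f"baseline:  {evaluate(baseline, test_set):.2f}")   # 0.00
print(f"augmented: {evaluate(augmented, test_set):.2f}")  # 1.00
```

In a real run, `baseline` and `augmented` would call the LLM without and with retrieved context; the scoring harness stays the same.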
Solution
import openai

# Sample documents
documents = [
    "The Eiffel Tower is located in Paris.",
    "The Great Wall of China is visible from space.",
    "Python is a popular programming language.",
    "The Amazon rainforest is the largest tropical rainforest."
]

# Simple keyword-based retrieval function
def retrieve_context(question, docs, top_k=1):
    question_words = set(question.lower().split())
    scored_docs = []
    for doc in docs:
        doc_words = set(doc.lower().split())
        score = len(question_words.intersection(doc_words))
        scored_docs.append((score, doc))
    scored_docs.sort(key=lambda x: x[0], reverse=True)
    top_docs = [doc for score, doc in scored_docs if score > 0][:top_k]
    return ' '.join(top_docs) if top_docs else ''

# Function to generate an answer using retrieved context
def generate_answer(question):
    context = retrieve_context(question, documents)
    prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
    # The legacy Completion endpoint and text-davinci-003 are retired;
    # use the current chat completions API instead.
    client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=50,
        temperature=0
    )
    return response.choices[0].message.content.strip()

# Example usage
question = "Where is the Eiffel Tower located?"
answer = generate_answer(question)
print(f"Question: {question}")
print(f"Answer: {answer}")
Added a simple retrieval function to find relevant documents based on question keywords.
Combined retrieved context with the question in the prompt sent to the LLM.
Left the LLM itself untouched: no fine-tuning or architecture changes, per the task constraint.
Gave the retrieval sort an explicit key (the overlap score) so ties do not fall back to comparing document strings.
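The retrieval step can be sanity-checked offline, without an API key. The sketch below is a self-contained copy of `retrieve_context` and the sample documents from the solution, exercised with the example question from the usage snippet and with a question that matches nothing:

```python
# Offline sanity check of the keyword-overlap retrieval step
# (self-contained copy of the solution's retrieve_context; no API call needed).
documents = [
    "The Eiffel Tower is located in Paris.",
    "The Great Wall of China is visible from space.",
    "Python is a popular programming language.",
    "The Amazon rainforest is the largest tropical rainforest."
]

def retrieve_context(question, docs, top_k=1):
    question_words = set(question.lower().split())
    scored_docs = []
    for doc in docs:
        doc_words = set(doc.lower().split())
        score = len(question_words.intersection(doc_words))
        scored_docs.append((score, doc))
    scored_docs.sort(key=lambda x: x[0], reverse=True)
    top_docs = [doc for score, doc in scored_docs if score > 0][:top_k]
    return ' '.join(top_docs) if top_docs else ''

# A question about the Eiffel Tower retrieves the matching document...
print(retrieve_context("Where is the Eiffel Tower located?", documents))
# ...and a question with no overlapping words retrieves nothing.
print(repr(retrieve_context("Quantum entanglement?", documents)))
```

Running this prints the Eiffel Tower document first and an empty string second, confirming both the ranking and the no-match fallback.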
Results Interpretation

Before: Answers were generic and sometimes incorrect because the model had no extra information.

After: Answers include relevant facts from retrieved documents, making them more accurate and helpful.

Adding relevant external context to the language model input helps it generate better, more accurate answers without changing the model itself.
Bonus Experiment
Try using embedding-based similarity search instead of keyword matching for retrieval to improve context relevance.
💡 Hint
Use sentence embeddings from a library like SentenceTransformers to find documents closest in meaning to the question.
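The similarity-ranking mechanics can be sketched without downloading a model. Below, a toy bag-of-words count vector stands in for a real sentence embedding (in practice you would swap in vectors from SentenceTransformers' `model.encode`); the cosine-similarity ranking logic is the part that carries over:

```python
# Sketch of similarity-based retrieval. The embed() function here is a
# toy stand-in: real sentence embeddings (e.g. from SentenceTransformers)
# would replace it, while the cosine ranking below stays the same.
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

def retrieve_by_similarity(question, docs, top_k=1):
    """Rank documents by cosine similarity to the question and keep the top_k."""
    q_vec = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)
    return ranked[:top_k]

documents = [
    "The Eiffel Tower is located in Paris.",
    "Python is a popular programming language.",
]
print(retrieve_by_similarity("Which language is popular for programming?", documents))
```

Unlike raw keyword overlap, cosine similarity normalizes for document length, so long documents do not win simply by containing more words.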