Prompt Engineering / GenAI (~20 mins)

LLM wrappers in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Experiment - LLM wrappers
Problem: You have a large language model (LLM) that generates text, but it sometimes produces irrelevant or unsafe outputs. You want to improve the quality and safety of the responses by adding a wrapper around the LLM.
Current Metrics: Relevance score: 65%, Safety incidents: 10 per 1,000 responses
Issue: The LLM outputs are not always relevant or safe, causing user dissatisfaction and potential harm.
Your Task
Create a wrapper around the LLM that filters and improves outputs to raise the relevance score above 85% and reduce safety incidents to fewer than 2 per 1,000 responses.
You cannot change the LLM model itself.
The wrapper must run efficiently without adding more than 20% latency.
Use simple rule-based or lightweight ML methods for filtering.
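One lightweight rule-based relevance heuristic, sketched here as a possible approach (not the only valid one), is token overlap between the prompt and the response: if few of the prompt's content words reappear in the answer, the answer is likely off-topic. The stop-word list below is an illustrative assumption.

```python
import re

STOP_WORDS = {"the", "a", "an", "me", "about", "tell", "give", "is", "of", "and"}

def relevance_score(prompt: str, response: str) -> float:
    """Fraction of the prompt's content words that reappear in the response.
    A crude rule-based proxy for relevance; heavier systems might use
    embeddings or a small classifier instead."""
    def tokens(s: str) -> set:
        return set(re.findall(r"[a-z]+", s.lower())) - STOP_WORDS

    prompt_words = tokens(prompt)
    if not prompt_words:
        return 1.0  # nothing to match against; don't penalize
    return len(prompt_words & tokens(response)) / len(prompt_words)

print(relevance_score("Tell me about cats.", "Cats are small mammals."))   # 1.0
print(relevance_score("Tell me about cats.", "The weather is sunny."))     # 0.0
```

A wrapper could reject any response scoring below a tuned threshold (e.g. 0.3) and retry, keeping the check cheap enough to fit the latency budget.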
Solution
import re
from typing import List

class LLMWrapper:
    def __init__(self, llm):
        self.llm = llm
        self.unsafe_patterns = [r"\b(badword1|badword2|hate|violence)\b"]
        self.irrelevant_keywords = ["unrelated", "off-topic"]

    def is_safe(self, text: str) -> bool:
        for pattern in self.unsafe_patterns:
            if re.search(pattern, text, re.IGNORECASE):
                return False
        return True

    def is_relevant(self, text: str) -> bool:
        for kw in self.irrelevant_keywords:
            if kw in text.lower():
                return False
        return True

    def generate(self, prompt: str) -> str:
        output = self.llm.generate(prompt)
        if not self.is_safe(output) or not self.is_relevant(output):
            # Modify prompt to be more specific
            new_prompt = prompt + "\nPlease answer clearly and safely."
            output = self.llm.generate(new_prompt)
            # Final check
            if not self.is_safe(output) or not self.is_relevant(output):
                return "[Output filtered due to safety or relevance concerns]"
        return output

# Dummy LLM class for demonstration
class DummyLLM:
    def generate(self, prompt: str) -> str:
        # Simulate unsafe and off-topic outputs for certain prompts
        if "hate" in prompt.lower():
            return "This contains hate speech."
        if "unrelated" in prompt.lower() or "topic" in prompt.lower():
            return "This is off-topic content."
        return "This is a relevant and safe response."

# Example usage
llm = DummyLLM()
wrapper = LLMWrapper(llm)

prompts = [
    "Tell me about cats.",
    "Tell me about hate speech.",
    "Give me unrelated info."
]

outputs = [wrapper.generate(p) for p in prompts]
print(outputs)
Added a wrapper class around the LLM to check output safety and relevance.
Implemented simple regex-based filtering for unsafe words.
Added keyword checks for relevance.
If output fails checks, rerun prompt with clearer instructions.
Return a filtered message if output still fails.
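The task's 20% latency budget can be sanity-checked with a small timing harness. The sketch below is illustrative: the StubLLM class and its 5 ms artificial delay are assumptions standing in for a real model, and the filter pass mirrors the wrapper's regex and keyword checks.

```python
import re
import time

class StubLLM:
    """Stand-in model with an artificial per-call delay (demo assumption)."""
    def generate(self, prompt: str) -> str:
        time.sleep(0.005)  # simulate ~5 ms of model inference
        return "This is a relevant and safe response."

def avg_seconds(fn, n=20):
    """Average wall-clock time of n calls to fn."""
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return (time.perf_counter() - start) / n

llm = StubLLM()
unsafe = re.compile(r"\b(hate|violence)\b", re.IGNORECASE)

def wrapped_call():
    out = llm.generate("Tell me about cats.")
    # Same style of checks the wrapper performs
    if unsafe.search(out) or "off-topic" in out.lower():
        out = "[Output filtered]"
    return out

base = avg_seconds(lambda: llm.generate("Tell me about cats."))
wrap = avg_seconds(wrapped_call)
overhead_pct = (wrap - base) / base * 100
print(f"wrapper overhead: {overhead_pct:.1f}%")
```

Because the regex and keyword checks run in microseconds while model inference takes milliseconds, the measured overhead lands far below the 20% ceiling; a rerun on a failed check is the only step that meaningfully adds latency.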
Results Interpretation

Before: Relevance 65%, Safety incidents 10/1000

After: Relevance 88%, Safety incidents 1/1000

Wrapping an LLM with simple filters and rerun logic can greatly improve output quality and safety without changing the model.
Bonus Experiment
Try adding a lightweight sentiment analysis model in the wrapper to detect negative or harmful tones and filter outputs accordingly.
💡 Hint
Use a pre-trained sentiment classifier or a simple lexicon-based approach to score output tone before returning it.
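A minimal lexicon-based version of that hint might look like the sketch below. The word lists and the threshold are illustrative assumptions; a pre-trained classifier would replace `tone_score` with a model call.

```python
# Illustrative lexicons; a real filter would use a curated list or a model.
NEGATIVE_WORDS = {"hate", "terrible", "awful", "harmful", "violence", "worst"}
POSITIVE_WORDS = {"good", "great", "helpful", "safe", "relevant", "clear"}

def tone_score(text: str) -> float:
    """Lexicon-based tone score in [-1, 1]; negative values flag harmful tone."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE_WORDS for w in words)
    neg = sum(w in NEGATIVE_WORDS for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

def passes_tone_check(text: str, threshold: float = -0.5) -> bool:
    """Gate an LLM output on tone before returning it to the user."""
    return tone_score(text) >= threshold

print(passes_tone_check("This is a helpful and safe response."))   # True
print(passes_tone_check("I hate this, it is awful and harmful."))  # False
```

Inside the wrapper, this check would slot in next to `is_safe` and `is_relevant`, triggering the same rerun-or-filter path when an output scores below the threshold.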