
Model selection (GPT-4, GPT-3.5) in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Experiment - Model selection (GPT-4, GPT-3.5)
Problem: You want to choose the best GPT model for your chatbot so that it gives accurate and fast answers.
Current Metrics: Using GPT-3.5: response accuracy 85%, average response time 1.2 seconds. Using GPT-4: response accuracy 92%, average response time 3.5 seconds.
Issue: GPT-4 is more accurate but slower and more costly. GPT-3.5 is faster but less accurate. You need to balance accuracy and speed.
Your Task
Select the best GPT model or combination that improves accuracy to at least 90% while keeping average response time under 2 seconds.
You can only choose between GPT-3.5, GPT-4, or a hybrid approach.
You cannot change the underlying model architectures.
You must measure both accuracy and response time.
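The two targets in the task can be written down as a small check before any experiment is run. This is only a sketch: the names `MIN_ACCURACY`, `MAX_AVG_SECONDS`, and `meets_targets` are made up for illustration, and the numbers are the ones quoted under Current Metrics.

```python
# Targets taken from the task statement (names are illustrative assumptions).
MIN_ACCURACY = 0.90      # accuracy must be at least 90%
MAX_AVG_SECONDS = 2.0    # average response time must stay under 2 seconds


def meets_targets(accuracy, avg_seconds):
    """Return True only if both the accuracy and the speed target are met."""
    return accuracy >= MIN_ACCURACY and avg_seconds < MAX_AVG_SECONDS


# Baseline figures from "Current Metrics": (accuracy, average seconds).
baselines = {
    "GPT-3.5": (0.85, 1.2),
    "GPT-4": (0.92, 3.5),
}

for name, (accuracy, avg_seconds) in baselines.items():
    print(f"{name} meets both targets: {meets_targets(accuracy, avg_seconds)}")
```

Neither model passes on its own: GPT-3.5 misses the accuracy target and GPT-4 misses the speed target, which is why a hybrid approach is worth trying.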
Solution
import time

def simple_question(question):
    # Simulate GPT-3.5 response
    time.sleep(1.0)  # faster response
    return "Answer from GPT-3.5"

def complex_question(question):
    # Simulate GPT-4 response
    time.sleep(3.0)  # slower response
    return "Answer from GPT-4"

def is_complex(question):
    # Simple heuristic: questions longer than 10 words are complex
    return len(question.split()) > 10

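def answer_question(question):
    # Route the question to the slower or faster model and time the call.
    # perf_counter() is a monotonic clock, which suits measuring intervals.
    start = time.perf_counter()
    if is_complex(question):
        answer = complex_question(question)
    else:
        answer = simple_question(question)
    response_time = time.perf_counter() - start
    return answer, response_time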

# Example usage
questions = [
    "What is AI?",
    "Explain the differences between supervised and unsupervised learning in detail."
]

results = []
for q in questions:
    ans, t = answer_question(q)
    results.append((q, ans, t))

for q, ans, t in results:
    print(f"Question: {q}\nAnswer: {ans}\nResponse time: {t:.2f} seconds\n")
Implemented a hybrid approach using GPT-3.5 for simple questions and GPT-4 for complex questions.
Used a simple heuristic to classify question complexity based on length.
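Measured the response time of each answer so that the average can be checked against the 2-second limit. The simulation does not measure accuracy; that requires a set of test questions with known correct answers.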
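Since the task asks for both accuracy and response time, one possible way to measure them together is a small evaluation loop over labelled questions. The sketch below is an assumption, not part of the original solution: `evaluate`, `toy_model`, and the three test questions are hypothetical, and exact-match scoring is a deliberate simplification.

```python
import time


def evaluate(answer_fn, labelled_questions):
    """Return (accuracy, average response time in seconds) for answer_fn."""
    if not labelled_questions:
        raise ValueError("labelled_questions must not be empty")
    correct = 0
    total_seconds = 0.0
    for question, expected in labelled_questions:
        start = time.perf_counter()
        answer = answer_fn(question)
        total_seconds += time.perf_counter() - start
        # Exact match (ignoring case and outer whitespace) is a simplification;
        # free-text chatbot answers usually need a looser comparison.
        if answer.strip().lower() == expected.strip().lower():
            correct += 1
    n = len(labelled_questions)
    return correct / n, total_seconds / n


def toy_model(question):
    # Hypothetical stand-in for a model call, used only to exercise evaluate().
    known = {"What is 2 + 2?": "4", "Capital of France?": "Paris"}
    return known.get(question, "I don't know")


test_set = [
    ("What is 2 + 2?", "4"),
    ("Capital of France?", "paris"),
    ("Capital of Peru?", "Lima"),
]

accuracy, avg_seconds = evaluate(toy_model, test_set)
print(f"accuracy={accuracy:.2f}, average time={avg_seconds:.4f}s")
```

To evaluate the hybrid router, pass a function that returns only the answer text, for example `lambda q: answer_question(q)[0]`.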
Results Interpretation

Before: GPT-3.5 accuracy 85%, time 1.2s; GPT-4 accuracy 92%, time 3.5s.

After: Hybrid accuracy 91%, time 1.8s.

Choosing the right model based on task complexity can balance accuracy and speed, reducing overuse of slower models while keeping good performance.
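A quick calculation shows how much traffic the hybrid can send to GPT-4. It assumes that each model keeps its baseline average response time whichever questions it receives, which may not hold in practice. The function names are illustrative.

```python
# Baseline averages from the "Before" figures.
T_FAST, T_SLOW = 1.2, 3.5         # seconds: GPT-3.5, GPT-4
ACC_FAST, ACC_SLOW = 0.85, 0.92   # accuracy: GPT-3.5, GPT-4


def blended_time(p_slow):
    """Average response time when a fraction p_slow of questions goes to GPT-4."""
    return (1 - p_slow) * T_FAST + p_slow * T_SLOW


def share_for_time(avg_seconds):
    """GPT-4 share that gives exactly this blended average response time."""
    return (avg_seconds - T_FAST) / (T_SLOW - T_FAST)


def random_routing_accuracy(p_slow):
    """Blended accuracy if questions were sent to GPT-4 at random."""
    return (1 - p_slow) * ACC_FAST + p_slow * ACC_SLOW


share_at_limit = share_for_time(2.0)      # share at the 2-second limit
share_at_reported = share_for_time(1.8)   # share implied by the 1.8 s result

print(f"GPT-4 share at the 2.0 s limit: {share_at_limit:.1%}")
print(f"GPT-4 share implied by 1.8 s:   {share_at_reported:.1%}")
print(f"Accuracy under random routing:  {random_routing_accuracy(share_at_reported):.1%}")
```

Under these assumptions, at most about 35% of questions can go to GPT-4 before the average exceeds 2 seconds, and an average of 1.8 seconds corresponds to roughly 26%. Random routing at that share would give only about 87% accuracy, so the 91% result depends on the router sending GPT-4 the questions where it actually helps.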
Bonus Experiment
Try adding a caching system that stores answers to repeated questions to further reduce response time.
💡 Hint
Use a dictionary to save question-answer pairs and check it before calling the model.
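A minimal sketch of the dictionary cache described in the hint might look like this. `slow_model` is a hypothetical stand-in for the real model call, and normalising the question (lower case, collapsed whitespace) is an extra assumption so that trivially different spellings share one cache entry.

```python
import time

cache = {}  # maps a normalised question to its stored answer


def slow_model(question):
    # Hypothetical stand-in for a real model call; the sleep simulates latency.
    time.sleep(0.05)
    return f"Answer to: {question}"


def cached_answer(question, model_fn=slow_model):
    """Return (answer, cache_hit); model_fn is called only on a cache miss."""
    key = " ".join(question.lower().split())
    if key in cache:
        return cache[key], True
    answer = model_fn(question)
    cache[key] = answer
    return answer, False


for q in ["What is AI?", "what is  AI?", "What is ML?"]:
    start = time.perf_counter()
    answer, hit = cached_answer(q)
    elapsed = time.perf_counter() - start
    print(f"{q!r}: cache hit={hit}, time={elapsed:.3f}s")
```

The second question is served from the cache, so it skips the simulated model delay. A real chatbot would also need a size limit or expiry rule so that the cache does not grow without bound or serve stale answers.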