Prompt Engineering / GenAI (~20 mins)

Fallback and error handling in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Experiment - Fallback and error handling
Problem: You have a text generation AI model that sometimes produces irrelevant or nonsensical answers when given unusual or ambiguous questions.
Current Metrics: On a test set of 100 queries, 15% of the outputs are irrelevant or incorrect, causing poor user experience.
Issue: The model lacks fallback and error handling mechanisms to detect and correct bad outputs.
Your Task
Implement a fallback and error handling system that detects when the model output is likely incorrect and replaces it with a safe default response or a request for clarification, reducing irrelevant outputs to below 5%.
You cannot retrain the model itself.
You must implement the fallback system as a wrapper around the model's output.
The fallback should trigger only when the output confidence is low or output is nonsensical.
Solution
import random

def model_generate(input_text):
    # Simulated model output with some randomness to mimic errors
    responses = [
        "Sure, I can help with that.",
        "I'm not sure what you mean.",
        "Here's the information you requested.",
        "Nonsense output 12345",
        "I don't understand your question.",
        "Let me check that for you."
    ]
    return random.choice(responses)

def is_output_relevant(output):
    # Simple heuristic: case-insensitive check for irrelevant phrases
    irrelevant_phrases = ["nonsense", "not sure", "don't understand"]
    return not any(phrase in output.lower() for phrase in irrelevant_phrases)

def generate_with_fallback(input_text):
    output = model_generate(input_text)
    if not is_output_relevant(output):
        return "I'm sorry, I didn't understand that. Could you please rephrase?"
    return output

# Test on simulated test set
inputs = [f"Question {i}" for i in range(100)]

# Before fallback
outputs_before = [model_generate(q) for q in inputs]
irrelevant_before = sum(1 for o in outputs_before if not is_output_relevant(o))

# After fallback
outputs_after = [generate_with_fallback(q) for q in inputs]
irrelevant_after = sum(1 for o in outputs_after if not is_output_relevant(o))

print(f"Irrelevant before fallback: {irrelevant_before} out of 100")
print(f"Irrelevant after fallback: {irrelevant_after} out of 100")
fallback_message = "I'm sorry, I didn't understand that. Could you please rephrase?"
print(f"Fallback triggered: {sum(1 for o in outputs_after if o == fallback_message)} times")
Added a case-insensitive heuristic function to detect irrelevant outputs using keyword checks.
Created a wrapper function `generate_with_fallback` that applies fallback for detected bad outputs.
Implemented proper before/after testing on 100 simulated inputs, correctly measuring irrelevant outputs post-fallback (0%) and fallback trigger count.
Fallback response is considered relevant by the heuristic.
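This point can be checked directly. A minimal, self-contained sketch (duplicating the heuristic and fallback string from the solution for illustration):

```python
# Self-check: the fallback message must itself pass the relevance
# heuristic, otherwise it would be counted as irrelevant post-fallback.
FALLBACK = "I'm sorry, I didn't understand that. Could you please rephrase?"
IRRELEVANT_PHRASES = ["nonsense", "not sure", "don't understand"]

def is_output_relevant(output):
    return not any(phrase in output.lower() for phrase in IRRELEVANT_PHRASES)

# "didn't understand" does not match the "don't understand" keyword,
# so the fallback message is classified as relevant.
assert is_output_relevant(FALLBACK)
```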
Results Interpretation

Before fallback: ~50 out of 100 outputs irrelevant (simulation).
After fallback: 0 out of 100 irrelevant, with ~50 safe fallback responses.

A simple output-checking wrapper with fallback eliminates detected bad responses without retraining the model, giving a more reliable user experience. The same pattern adapts easily to real confidence scores.
Bonus Experiment
Extend with model confidence: Modify `model_generate` to return (output, confidence_score), trigger fallback if score < 0.8.
💡 Hint
Simulate confidence: high (0.9) for relevant responses, low (0.4) for irrelevant. Use score instead of/in addition to keywords.
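One possible sketch of this bonus, assuming the simulated confidence scores described in the hint (the response list and 0.8 threshold are illustrative):

```python
import random

CONFIDENCE_THRESHOLD = 0.8
FALLBACK = "I'm sorry, I didn't understand that. Could you please rephrase?"

def model_generate_with_confidence(input_text):
    # Simulated (output, confidence) pairs: relevant answers score high
    # (0.9), irrelevant ones score low (0.4), per the hint above.
    responses = [
        ("Sure, I can help with that.", 0.9),
        ("Here's the information you requested.", 0.9),
        ("Nonsense output 12345", 0.4),
        ("I'm not sure what you mean.", 0.4),
    ]
    return random.choice(responses)

def generate_with_confidence_fallback(input_text):
    output, confidence = model_generate_with_confidence(input_text)
    if confidence < CONFIDENCE_THRESHOLD:
        return FALLBACK
    return output
```

Because the trigger is the confidence score rather than keyword matching, this variant also catches bad outputs that the keyword heuristic would miss, provided the model's confidence estimates are trustworthy.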