
First interaction with GenAI APIs - Model Metrics & Evaluation

Which metric matters for this concept and WHY

When you first use a GenAI API, the key metric to watch is response relevance: how well the model's answers match what you asked. Because GenAI produces free-form text, exact-match correctness is hard to measure, so you judge how useful and accurate the responses are instead. A second important metric is latency, the time the API takes to respond, because fast answers improve the user experience.
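A minimal sketch of how latency could be measured. The function `fake_genai_call` is a hypothetical stand-in for a real API call and is not part of any real library; only `time.perf_counter` from the Python standard library is real.

```python
import time

def fake_genai_call(prompt: str) -> str:
    """Hypothetical stand-in for a real GenAI API call."""
    time.sleep(0.01)  # simulate network and generation delay
    return "Paris"

def timed_call(fn, prompt: str):
    """Return (response, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    response = fn(prompt)
    elapsed = time.perf_counter() - start
    return response, elapsed

response, latency = timed_call(fake_genai_call, "What is the capital of France?")
print(f"response={response!r} latency={latency:.3f}s")
```

In practice you would time many calls and look at the average and the slowest cases, since a single measurement can be noisy.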

Confusion matrix or equivalent visualization (ASCII)

For GenAI text generation, a confusion matrix is not typical. Instead, you can think of evaluation like this:

User Query: "What is the capital of France?"

Possible AI Responses:
- Correct: "Paris"
- Incorrect: "Berlin"

Evaluation:
- True Positive (TP): AI gives "Paris" when asked about France's capital.
- False Positive (FP): AI gives "Paris" when asked about Germany's capital.
- False Negative (FN): AI fails to say "Paris" when asked about France.
- True Negative (TN): AI correctly does not say "Paris" for unrelated questions.
    

This helps you understand when the AI is right or wrong in context.
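The four cases above can be tallied in code. This is a sketch under simple assumptions: the (query, response) pairs are made up for illustration, and a query is treated as expecting "Paris" whenever it mentions France.

```python
from collections import Counter

def classify(expects_paris: bool, said_paris: bool) -> str:
    """Map one (expected, actual) pair to TP, FP, FN, or TN."""
    if expects_paris and said_paris:
        return "TP"
    if not expects_paris and said_paris:
        return "FP"
    if expects_paris and not said_paris:
        return "FN"
    return "TN"

# Illustrative, made-up (query, response) pairs
samples = [
    ("What is the capital of France?", "Paris"),    # TP
    ("What is the capital of Germany?", "Paris"),   # FP
    ("What is the capital of France?", "Lyon"),     # FN
    ("What is the capital of Germany?", "Berlin"),  # TN
]

counts = Counter(
    classify("France" in query, response == "Paris")
    for query, response in samples
)
print(dict(counts))
```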

Precision vs Recall tradeoff with concrete examples

For GenAI APIs, this lesson uses precision to mean how often the AI's answers are correct when it gives an answer, and recall to mean how often the AI provides an answer when it should.

Example: If you ask many questions, and the AI only answers some, it might have high precision (answers are mostly right) but low recall (misses many questions).

For a chatbot, you want a balance: good precision so answers are reliable, and good recall so it answers most questions.
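A small sketch of these two numbers, following this section's working definitions (precision as the share of given answers that are correct, recall as the share of questions that get an answer). The result list is made up for illustration: `None` marks a skipped question, `True` a correct answer, and `False` a wrong one.

```python
def precision_and_recall(results):
    """results: one entry per question; None = skipped, True/False = correct/wrong."""
    answered = [r for r in results if r is not None]
    precision = sum(answered) / len(answered) if answered else 0.0
    recall = len(answered) / len(results) if results else 0.0
    return precision, recall

# 10 questions: 4 answered (3 correct, 1 wrong), 6 skipped
results = [True, True, True, False] + [None] * 6
precision, recall = precision_and_recall(results)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here the model is right 75% of the time when it answers but covers only 40% of the questions, which is the high-precision, low-recall situation described above.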

What "good" vs "bad" metric values look like for this use case

Good: 90% of the AI's answers are correct (high precision), it responds to 85% of questions asked (high recall), and response time is under 1 second.

Bad: Only 50% of the AI's answers are correct (low precision), it skips many questions (low recall), and responses take over 5 seconds, frustrating users.

Metrics pitfalls
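  • Accuracy paradox: If most test questions share the same expected reply (for example, out-of-scope questions where "I don't know" is correct), a model that always answers "I don't know" can score high accuracy yet be useless.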
  • Data leakage: Testing on questions the AI was trained on can inflate performance.
  • Overfitting: AI might memorize answers instead of understanding, failing on new questions.
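  • Misweighing latency: Slow responses frustrate users, but fast wrong answers are worse than slower, correct ones, so track latency alongside correctness rather than instead of it.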
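The accuracy paradox can be illustrated with a made-up test set. The sketch below assumes 90 of 100 questions are out of scope, so "I don't know" is the expected reply for them, and uses a hypothetical `lazy_model` that refuses every question.

```python
# Made-up test set: 90 out-of-scope questions (expected reply: "I don't know")
# and 10 answerable ones (expected reply: a real answer).
expected = ["I don't know"] * 90 + ["real answer"] * 10

def lazy_model(_question_index: int) -> str:
    """Hypothetical model that refuses every question."""
    return "I don't know"

predictions = [lazy_model(i) for i in range(len(expected))]

# Overall accuracy looks strong because refusals dominate the test set.
accuracy = sum(p == e for p, e in zip(predictions, expected)) / len(expected)

# Accuracy restricted to the questions that actually need an answer.
answerable = [(p, e) for p, e in zip(predictions, expected) if e != "I don't know"]
answerable_accuracy = sum(p == e for p, e in answerable) / len(answerable)

print(f"overall accuracy: {accuracy:.0%}")
print(f"accuracy on answerable questions: {answerable_accuracy:.0%}")
```

The headline accuracy hides the fact that the model never gets an answerable question right, which is why accuracy alone is a poor guide on unbalanced test sets.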
Self-check question

Your GenAI model's answers are 98% accurate, but it responds to only 12% of the questions asked. Is it good for production? Why or why not?

Answer: No, because the model rarely answers questions (low recall). Even if answers are mostly correct, users will be frustrated by many unanswered queries.
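The arithmetic behind this answer, as a short sketch: multiplying accuracy by coverage gives the share of all questions that end with a correct answer, which is roughly 11.8% here, while 88% of questions get no answer at all.

```python
accuracy_when_answering = 0.98  # share of given answers that are correct
coverage = 0.12                 # share of questions that get any answer

# Share of ALL questions that end with a correct answer.
correct_share = accuracy_when_answering * coverage
# Share of questions that get no answer at all.
unanswered_share = 1 - coverage

print(f"questions answered correctly: {correct_share:.1%}")
print(f"questions left unanswered: {unanswered_share:.0%}")
```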

Key Result
For first-time GenAI API use, balance response relevance (precision) and coverage (recall) with fast response times for best user experience.

Practice

1. What is the main purpose of a GenAI API when you first interact with it?
easy
A. To train a new AI model from scratch
B. To store large datasets for AI training
C. To manually code AI algorithms
D. To send a prompt and receive a text response from the AI model

Solution

  1. Step 1: Understand what GenAI APIs do

    GenAI APIs let you send a prompt (a question or task) to an AI model.
  2. Step 2: Identify the response from the API

    The API returns a text response generated by the AI based on your prompt.
  3. Final Answer:

    To send a prompt and receive a text response from the AI model -> Option D
  4. Quick Check:

    GenAI API = prompt in, text out [OK]
Hint: GenAI APIs take your question and give text answers [OK]
Common Mistakes:
  • Thinking you train the AI on first use
  • Believing you write AI code manually
  • Confusing API with data storage
2. Which of the following is the correct way to send a prompt to a GenAI API in Python?
easy
A. response = genai.ask(prompt)
B. response = genai.ask('Hello AI!')
C. response = genai.ask(prompt='Hello AI!')
D. response = genai.ask(input='Hello AI!')

Solution

  1. Step 1: Check the correct parameter name for prompt

    The genai.ask() function used in this lesson expects the prompt to be passed with the keyword prompt=.
  2. Step 2: Verify the syntax for calling the API

    Using named argument prompt='Hello AI!' matches the expected syntax.
  3. Final Answer:

    response = genai.ask(prompt='Hello AI!') -> Option C
  4. Quick Check:

    Use prompt= keyword to send text [OK]
Hint: Use prompt='text' when calling genai.ask() [OK]
Common Mistakes:
  • Omitting the prompt= keyword
  • Using wrong parameter name like input=
  • Passing a bare variable name (prompt) when a quoted string is needed
3. Given the code below, what will be printed?
response = genai.ask(prompt='What is 2 + 2?')
print(response.text)
medium
A. '4'
B. 'What is 2 + 2?'
C. An error because response has no attribute text
D. '2 + 2 equals 4'

Solution

  1. Step 1: Understand the prompt sent to the AI

    The prompt asks the AI a simple math question: 'What is 2 + 2?'.
  2. Step 2: Predict the AI's text response

    For a simple arithmetic prompt, the expected response is the answer '4' as text, accessible via response.text. Real models may phrase the answer differently from run to run.
  3. Final Answer:

    '4' -> Option A
  4. Quick Check:

    Simple math prompt returns answer text [OK]
Hint: AI answers math questions with the result as text [OK]
Common Mistakes:
  • Expecting the prompt text to be printed
  • Assuming response.text does not exist
  • Thinking AI returns full sentence instead of just answer
4. You wrote this code but get an error:
response = genai.ask('Hello AI!')
print(response.text)
What is the likely cause?
medium
A. The print statement is incorrect
B. The prompt argument is missing its keyword name
C. The genai.ask function does not exist
D. response.text is not a valid attribute

Solution

  1. Step 1: Check how the prompt is passed to genai.ask()

    The code passes 'Hello AI!' without specifying prompt= keyword.
  2. Step 2: Understand the API expects prompt= keyword

    Without prompt=, the function may raise an error or not recognize the input.
  3. Final Answer:

    The prompt argument is missing its keyword name -> Option B
  4. Quick Check:

    Always use prompt= when calling genai.ask() [OK]
Hint: Always name the prompt argument: prompt='text' [OK]
Common Mistakes:
  • Passing prompt as positional argument
  • Assuming print statement causes error
  • Thinking response.text is invalid
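One way a function can require the prompt= keyword in Python is a keyword-only parameter. The sketch below uses a hypothetical stand-in named `ask`; it is not the real signature of any GenAI library, and a real client may accept positional arguments.

```python
def ask(*, prompt: str) -> str:
    """Hypothetical stand-in: the bare * makes prompt keyword-only."""
    return f"(model reply to: {prompt})"

# Keyword form works.
reply = ask(prompt="Hello AI!")
print(reply)

# Positional form raises TypeError because prompt is keyword-only.
error_message = None
try:
    ask("Hello AI!")
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```

Always check the documentation of the client you use, since parameter names and calling conventions differ between libraries.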
5. You want to use a GenAI API to get a short story about a cat. Which approach is best to get a clear, useful response?
hard
A. Send prompt='Write a short story about a cat in 3 sentences.'
B. Send prompt='cat story'
C. Send prompt='Tell me something interesting.'
D. Send prompt='Write a story'

Solution

  1. Step 1: Identify the prompt that clearly states the task

    Send prompt='Write a short story about a cat in 3 sentences.' specifies the topic (cat), the type (short story), and length (3 sentences).
  2. Step 2: Compare other prompts for clarity

    Options B, C, and D are vague and may produce unrelated or overly long responses.
  3. Final Answer:

    Send prompt='Write a short story about a cat in 3 sentences.' -> Option A
  4. Quick Check:

    Clear, detailed prompts get better AI answers [OK]
Hint: Be specific and clear in your prompt for best results [OK]
Common Mistakes:
  • Using too short or vague prompts
  • Not specifying length or topic clearly
  • Expecting AI to guess details
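A small sketch of building a specific prompt from its parts. The helper `build_prompt` is hypothetical and only illustrates stating the form, topic, and length explicitly instead of leaving them for the model to guess.

```python
def build_prompt(form: str, topic: str, sentences: int) -> str:
    """Assemble a specific prompt from form, topic, and length."""
    return f"Write a {form} about {topic} in {sentences} sentences."

prompt = build_prompt("short story", "a cat", 3)
print(prompt)
```

The result matches the prompt in Option A, and the same helper can be reused for other topics or lengths.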