
Output format control in Prompt Engineering / GenAI - Model Metrics & Evaluation

Metrics & Evaluation - Output format control
Which metric matters for Output format control and WHY

Output format control means making sure the model's answers come in the right shape and style. For example, if a model should give a list of names, it should not give a paragraph instead. The key metric here is format accuracy, which checks if the output matches the expected format exactly. This is important because even if the content is correct, a wrong format can break the next steps in a system.
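
As a rough illustration of format accuracy, the sketch below scores a batch of raw model outputs against an assumed target format: a JSON object with exactly the keys "name" and "age". The helper names `is_valid_format` and `format_accuracy` and the sample outputs are hypothetical and exist only for this example.

```python
import json

EXPECTED_KEYS = {"name", "age"}  # assumed target format for this sketch


def is_valid_format(raw: str) -> bool:
    """Return True if raw parses as a JSON object with exactly the expected keys."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and set(parsed) == EXPECTED_KEYS


def format_accuracy(outputs: list) -> float:
    """Share of outputs that match the expected format (0.0 for an empty batch)."""
    if not outputs:
        return 0.0
    return sum(is_valid_format(o) for o in outputs) / len(outputs)


# Hypothetical model outputs: two well-formed, two not.
sample_outputs = [
    '{"name": "Ada", "age": 36}',
    '{"name": "Linus", "age": 54}',
    'Name: Grace, Age: 85',   # plain text, not JSON
    '{"name": "Alan"}',       # valid JSON, but "age" is missing
]
print(format_accuracy(sample_outputs))  # 0.5
```

This check looks only at the shape of the output. Whether the values themselves are right is a separate question.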

Confusion matrix or equivalent visualization
Expected Format: JSON object with keys 'name' and 'age'

Model Output Format Check:

|                  | Correct Content | Incorrect Content |
|------------------|-----------------|-------------------|
| Correct Format   | TP              | FP                |
| Incorrect Format | FN              | TN                |

Where:
- TP: Model output matches expected format and content
- FP: Model output format is correct but content is wrong
- FN: Model output format is wrong but content is correct
- TN: Model output format and content both wrong

Example counts:
TP=80, FP=10, FN=5, TN=5
Total=100

Precision = TP / (TP + FP) = 80 / (80 + 10) = 0.89
Recall = TP / (TP + FN) = 80 / (80 + 5) = 0.94
F1 = 2 * (0.89 * 0.94) / (0.89 + 0.94) = 0.91
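
The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch; the function name `precision_recall_f1` is hypothetical, and the counts are the example counts listed above.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple:
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


precision, recall, f1 = precision_recall_f1(tp=80, fp=10, fn=5)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# precision=0.89 recall=0.94 f1=0.91
```

TN does not appear in any of the three formulas, which is why these metrics stay informative even when TN dominates the counts.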
    
Precision vs Recall tradeoff with concrete examples

Using the matrix above, precision is the share of correctly formatted outputs whose content is also correct: TP / (TP + FP). Recall is the share of outputs with correct content that also arrive in the correct format: TP / (TP + FN).

For example, if a chatbot must always respond in JSON, high precision means that a response which parses as valid JSON rarely carries wrong content. High recall means that a correct answer is rarely lost because it was returned in the wrong format.

If precision is low, well-formed outputs pass format validation but feed wrong data to downstream steps. If recall is low, many correct answers are rejected or dropped because of their format, causing incomplete or missing data.

What "good" vs "bad" metric values look like for Output format control

Good: Precision and recall both above 90% mean that well-formatted outputs are almost always correct and correct answers almost always arrive in the expected format. This leads to smooth downstream processing.

Bad: Precision below 70% means many well-formatted outputs carry wrong content, causing silent errors. Recall below 70% means many correct answers arrive in the wrong format, leading to parse failures and incomplete results.

Metrics pitfalls
  • Ignoring content correctness: Output format control focuses on format, but content errors can still happen.
  • Overfitting to format: The model may produce the correct format but nonsense content.
  • Distribution shift: If the training data always has perfect format, the model may fail on real-world variations.
  • Accuracy paradox: High overall accuracy can hide poor format control if data is imbalanced.
Self-check question

Your model has 98% accuracy but only 12% recall on correct output format. Is it good for production? Why or why not?

Answer: No. The high accuracy is misleading (the accuracy paradox): with 12% recall, the model delivers the expected format in only 12% of the cases where it should. Most of those outputs arrive in the wrong format, which can break any system that relies on the output format.
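
To see how 98% accuracy and 12% recall can coexist, the sketch below uses hypothetical counts chosen only to reproduce those two figures. They are not measurements from any real model.

```python
# Hypothetical counts, picked so that accuracy is 98% and recall is 12%.
tp, fp, fn, tn = 3, 178, 22, 9797
total = tp + fp + fn + tn  # 10000 outputs in all

accuracy = (tp + tn) / total
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
# accuracy=0.98 recall=0.12
```

Almost all of the accuracy comes from the large TN cell, so the headline number says very little about the 25 cases that recall measures.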

Key Result
Precision and recall above 90% are key to ensuring that the model outputs the correct format reliably.

Practice

1. What is the main reason to control the output format of a machine learning model?
easy
A. To change the model's architecture
B. To increase the model's accuracy
C. To reduce the training time
D. To make the results easier to read and understand

Solution

  1. Step 1: Understand output format control

    Output format control is about how results are shown, not about model internals.
  2. Step 2: Identify the purpose of formatting

    Formatting helps make results clear and easy to read for users or other systems.
  3. Final Answer:

    To make the results easier to read and understand -> Option D
  4. Quick Check:

    Output format = readability [OK]
Hint: Output format helps people read results clearly [OK]
Common Mistakes:
  • Confusing output format with model accuracy
  • Thinking output format changes training speed
  • Believing output format alters model design
2. Which of the following is the correct way to format model output as a JSON string in Python?
easy
A. json.load(output)
B. json.dumps(output)
C. json.parse(output)
D. json.write(output)

Solution

  1. Step 1: Recall JSON functions in Python

    json.dumps() converts Python objects to JSON strings.
  2. Step 2: Check other options

    json.load() reads JSON from a file object. json.parse() and json.write() do not exist in Python's json module.
  3. Final Answer:

    json.dumps(output) -> Option B
  4. Quick Check:

    Convert to JSON string = json.dumps() [OK]
Hint: Use json.dumps() to get JSON string from Python data [OK]
Common Mistakes:
  • Using json.load() instead of dumps()
  • Trying json.parse() which doesn't exist in Python
  • Confusing reading JSON with writing JSON
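
A short round trip may help keep the writing and reading directions apart. In this sketch the `output` dictionary is a made-up model result; `json.dumps` and `json.loads` are the standard-library calls for strings.

```python
import json

output = {"label": "cat", "score": 0.93}  # hypothetical model output

json_string = json.dumps(output)    # Python object -> JSON string
restored = json.loads(json_string)  # JSON string -> Python object

print(json_string)         # {"label": "cat", "score": 0.93}
print(restored == output)  # True
```

The variants without the trailing "s", `json.dump` and `json.load`, do the same work against a file object instead of a string.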
3. Given the Python code:
predictions = [0.1, 0.9, 0.8]
formatted = ', '.join(str(p) for p in predictions)
print(formatted)

What will be the output?
medium
A. 0.1, 0.9, 0.8
B. [0.1, 0.9, 0.8]
C. 0.1 0.9 0.8
D. Error: join expects a string

Solution

  1. Step 1: Understand join with generator

    Each number is converted to string, then joined with ', ' separator.
  2. Step 2: Predict printed string

    Result is '0.1, 0.9, 0.8' as a single string.
  3. Final Answer:

    0.1, 0.9, 0.8 -> Option A
  4. Quick Check:

    Join list with ', ' = '0.1, 0.9, 0.8' [OK]
Hint: join() combines strings with separator [OK]
Common Mistakes:
  • Expecting list brackets in output
  • Thinking join adds spaces only
  • Confusing join with print of list
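
The sketch below repeats the join from this question and adds one assumed variation, fixed two-decimal formatting, which is often useful when predictions are shown in reports. It also shows what happens if the floats are passed to join() without conversion.

```python
predictions = [0.1, 0.9, 0.8]

# join() accepts only strings, so each float is converted first.
as_is = ", ".join(str(p) for p in predictions)
two_decimals = ", ".join(f"{p:.2f}" for p in predictions)

print(as_is)         # 0.1, 0.9, 0.8
print(two_decimals)  # 0.10, 0.90, 0.80

# Passing the floats directly raises a TypeError.
try:
    ", ".join(predictions)
except TypeError as err:
    print(f"TypeError: {err}")
```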
4. The code below tries to format model predictions as a table but throws an error:
predictions = [0.2, 0.5, 0.7]
print('Index | Prediction')
for i, p in predictions:
    print(f'{i} | {p}')

What is the error and how to fix it?
medium
A. Error: 'predictions' is not iterable as (index, value); fix by using enumerate(predictions)
B. Error: f-string syntax wrong; fix by removing curly braces
C. Error: print missing parentheses; fix by adding them
D. No error; code runs fine

Solution

  1. Step 1: Identify iteration error

    The loop tries to unpack each element into (i, p), but predictions is a list of floats, not pairs, so Python raises TypeError: cannot unpack non-iterable float object.
  2. Step 2: Fix by using enumerate

    Use for i, p in enumerate(predictions) to get index and value pairs.
  3. Final Answer:

    Error: 'predictions' is not iterable as (index, value); fix by using enumerate(predictions) -> Option A
  4. Quick Check:

    Use enumerate() to get index-value pairs [OK]
Hint: Use enumerate() to loop with index and value [OK]
Common Mistakes:
  • Trying to unpack single list items as tuples
  • Ignoring need for enumerate in loops
  • Misreading f-string syntax errors
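
For reference, here is one way the corrected loop might look. The `rows` list is an addition for this sketch so the formatted lines can be reused or inspected; the original snippet printed each line directly.

```python
predictions = [0.2, 0.5, 0.7]

# enumerate() yields (index, value) pairs, which is what the loop unpacks.
rows = []
for i, p in enumerate(predictions):
    rows.append(f"{i} | {p}")

print("Index | Prediction")
print("\n".join(rows))
# Index | Prediction
# 0 | 0.2
# 1 | 0.5
# 2 | 0.7
```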
5. You want to output model predictions as a JSON object with keys as sample IDs and values as predictions. Given:
sample_ids = ['s1', 's2', 's3']
predictions = [0.3, 0.6, 0.9]

Which code correctly creates this JSON string?
hard
A. json.dumps({predictions[i]: sample_ids[i] for i in range(len(predictions))})
B. json.dumps(dict(zip(predictions, sample_ids)))
C. json.dumps({sample_ids[i]: predictions[i] for i in range(len(sample_ids))})
D. json.dumps([sample_ids, predictions])

Solution

  1. Step 1: Match keys and values correctly

    Keys should be sample_ids, values should be predictions, so use dictionary comprehension with sample_ids as keys.
  2. Step 2: Check other options

    Options A and B swap keys and values, and option D produces a JSON array of two lists rather than an object.
  3. Final Answer:

    json.dumps({sample_ids[i]: predictions[i] for i in range(len(sample_ids))}) -> Option C
  4. Quick Check:

    Keys = sample_ids, values = predictions [OK]
Hint: Use a dict comprehension with sample IDs as keys and predictions as values [OK]
Common Mistakes:
  • Swapping keys and values in dict
  • Using list instead of dict for JSON object
  • Forgetting to convert dict to JSON string
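
As an alternative to indexing, the same mapping can be built with zip(). This sketch compares both approaches on the data from the question; the variable names `by_index` and `by_zip` are chosen here for illustration.

```python
import json

sample_ids = ["s1", "s2", "s3"]
predictions = [0.3, 0.6, 0.9]

# Index-based dict comprehension, as in the correct option.
by_index = json.dumps({sample_ids[i]: predictions[i] for i in range(len(sample_ids))})

# zip() pairs each ID with its prediction; the IDs must come first to be keys.
by_zip = json.dumps(dict(zip(sample_ids, predictions)))

print(by_zip)              # {"s1": 0.3, "s2": 0.6, "s3": 0.9}
print(by_index == by_zip)  # True
```

Note that zip() stops silently at the shorter input, so it is worth checking that both lists have the same length before pairing them.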