Content filtering models decide whether a piece of content is safe or harmful. The key metrics are precision and recall. Precision is the fraction of flagged items that are truly harmful. Recall is the fraction of harmful items that were caught. High recall means little harmful content gets through; high precision means little good content is wrongly blocked. Balancing both is critical.
Content filtering in Prompt Engineering / GenAI - Model Metrics & Evaluation
| | Predicted Harmful | Predicted Safe |
|------------------|--------------------|---------------------|
| Actually Harmful | True Positive (TP) | False Negative (FN) |
| Actually Safe | False Positive (FP) | True Negative (TN) |
Example:
TP = 80 (harmful content caught)
FP = 20 (safe content wrongly blocked)
TN = 900 (safe content allowed)
FN = 10 (harmful content missed)
Total samples = 80 + 20 + 900 + 10 = 1010
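The metrics for this example follow directly from the four counts. A minimal sketch using the standard definitions (the F1 score is added here only as a common way to summarize precision and recall; it does not appear in the text above):

```python
# Counts taken from the example above.
tp, fp, tn, fn = 80, 20, 900, 10

precision = tp / (tp + fp)                   # flagged items that were truly harmful
recall = tp / (tp + fn)                      # harmful items that were caught
accuracy = (tp + tn) / (tp + fp + tn + fn)   # all correct decisions
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.3f} recall={recall:.3f} "
      f"accuracy={accuracy:.3f} f1={f1:.3f}")
# precision=0.800 recall=0.889 accuracy=0.970 f1=0.842
```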
If the model flags content aggressively, recall rises but precision falls: more good posts are blocked, which annoys users. If it flags conservatively, precision rises but recall falls: harmful content slips through. For example, a social media platform wants to catch all hate speech (high recall) but also avoid blocking normal posts (high precision). The right balance depends on the platform's goals.
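One way to see this tradeoff is to sweep the decision threshold of a filter that outputs a harm score. The sketch below uses made-up scores and labels (the `samples` list and the `precision_recall` helper are hypothetical, not from any real model), and treats precision as 1.0 when nothing is flagged, which is only a convention:

```python
# Hypothetical (harm_score, is_harmful) pairs for illustration only.
samples = [(0.95, True), (0.90, True), (0.80, False), (0.70, True),
           (0.60, False), (0.40, True), (0.30, False), (0.10, False)]

def precision_recall(threshold):
    """Flag every sample whose score is >= threshold and score the result."""
    tp = sum(1 for s, y in samples if s >= threshold and y)
    fp = sum(1 for s, y in samples if s >= threshold and not y)
    fn = sum(1 for s, y in samples if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 1.0  # convention when nothing is flagged
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

for t in (0.85, 0.5, 0.2):
    p, r = precision_recall(t)
    print(f"threshold={t}: precision={p:.2f} recall={r:.2f}")
# threshold=0.85: precision=1.00 recall=0.50
# threshold=0.5: precision=0.60 recall=0.75
# threshold=0.2: precision=0.57 recall=1.00
```

Lowering the threshold catches more harmful items but flags more safe ones, which is the balance a platform has to choose.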
Good: Precision around 0.9 and Recall around 0.85 means most harmful content is caught and few good posts are blocked.
Bad: Precision 0.5 and Recall 0.95 means many good posts are wrongly blocked. Or Precision 0.95 and Recall 0.4 means many harmful posts are missed.
- Accuracy paradox: If harmful content is rare, a model that always predicts safe can have high accuracy but is useless.
- Data leakage: If information from the training data leaks into the test data, metrics look better than real-world performance.
- Overfitting: Very high training metrics but poor real-world performance.
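The accuracy paradox is easy to reproduce. A small sketch on a hypothetical dataset (1000 items, 10 of them harmful; these numbers are assumptions chosen for illustration) with a "model" that always predicts safe:

```python
# Hypothetical dataset: 10 harmful items (label 1) and 990 safe items (label 0).
labels = [1] * 10 + [0] * 990
predictions = [0] * len(labels)  # always predict "safe"

correct = sum(p == y for p, y in zip(predictions, labels))
tp = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
fn = sum(p == 0 and y == 1 for p, y in zip(predictions, labels))

accuracy = correct / len(labels)
recall = tp / (tp + fn)
print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
# accuracy=0.99 recall=0.00
```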
Your content filter model has 98% accuracy but only 12% recall on harmful content. Is it good for production? Why or why not?
Answer: No, it is not good. The model misses 88% of harmful content (low recall), which is dangerous. High accuracy is misleading because harmful content is rare. Improving recall is critical.
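To make the answer concrete, here is one set of counts that would produce those numbers. The counts are hypothetical (the question gives only the two percentages), chosen so that harmful content is 2% of 10,000 items:

```python
# Hypothetical counts consistent with 98% accuracy and 12% recall.
tp, fn = 24, 176     # 200 harmful items, only 24 caught
fp, tn = 24, 9776    # 9800 safe items, 24 wrongly blocked

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
recall = tp / (tp + fn)
missed_share = fn / (tp + fn)

print(f"accuracy={accuracy:.2f} recall={recall:.2f} missed={missed_share:.2f}")
# accuracy=0.98 recall=0.12 missed=0.88
```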
Practice
Solution
Step 1: Understand content filtering purpose
Content filtering is designed to detect and remove harmful or unsafe text to protect users.
Step 2: Compare options to purpose
Only "To block or clean harmful text to keep users safe" matches this goal; the others relate to unrelated AI tasks.
Final Answer:
To block or clean harmful text to keep users safe -> Option A
Quick Check:
Content filtering = block harmful text [OK]
- Confusing filtering with training speed
- Thinking filtering improves image accuracy
- Assuming filtering increases data size
Solution
Step 1: Recall Python syntax for substring check
In Python, the correct way to check whether a substring is in a string is the 'in' operator.
Step 2: Evaluate each option
"if banned_word in text:" uses correct syntax; the others use invalid or non-Python methods.
Final Answer:
if banned_word in text: -> Option C
Quick Check:
Substring check in Python uses 'in' keyword [OK]
- Using non-existent methods like contains()
- Using wrong keywords like 'inside'
- Confusing syntax from other languages
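A short sketch of the substring check in use. The sample strings are made up; the point is that 'in' is case-sensitive, so a filter often lowercases the text first (an assumption about what a filter would want, not something the quiz states):

```python
banned_word = "spam"                  # hypothetical banned word
text = "Buy now, this is not SPAM!"   # hypothetical message

case_sensitive = banned_word in text            # 'spam' does not match 'SPAM'
case_insensitive = banned_word in text.lower()  # lowercase the text first

print(case_sensitive)    # False
print(case_insensitive)  # True
```

Note that 'in' matches anywhere in the string, so "spam" would also match inside a longer word such as "spammer".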
bad_words = ['spam', 'scam']
text = 'This message contains spam and scam.'
filtered = any(word in text for word in bad_words)
print(filtered)
Solution
Step 1: Understand the any() function with generator
The expression checks whether any bad word is found in the text. Since 'spam' and 'scam' are both in the text, any() returns True.
Step 2: Confirm print output
Printing filtered will output True because the condition is met.
Final Answer:
True -> Option D
Quick Check:
any() finds bad words = True [OK]
- Thinking any() returns False if multiple matches
- Confusing any() with all()
- Expecting an error due to syntax
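The difference between any() and all() shows up when only some of the bad words are present. A small sketch (the second message, `partial`, is a hypothetical addition):

```python
bad_words = ['spam', 'scam']
text = 'This message contains spam and scam.'
partial = 'This message contains spam only.'  # hypothetical second message

any_full = any(word in text for word in bad_words)        # at least one match
all_full = all(word in text for word in bad_words)        # every word matches
any_partial = any(word in partial for word in bad_words)
all_partial = all(word in partial for word in bad_words)
matched = [word for word in bad_words if word in partial]  # which words matched

print(any_full, all_full)        # True True
print(any_partial, all_partial)  # True False
print(matched)                   # ['spam']
```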
bad_words = ['bad', 'ugly']
text = 'This is a bad example.'
if bad_words in text:
    print('Filtered')
else:
    print('Clean')
Solution
Step 1: Analyze the 'if' condition
The code checks whether a list is in a string. Python raises a TypeError here, because the left operand of 'in' must be a string when the right operand is a string.
Step 2: Correct way to check bad words in text
We should check each word individually, e.g., using any(word in text for word in bad_words).
Final Answer:
Using 'in' to check list in string is incorrect -> Option A
Quick Check:
Cannot check list in string directly [OK]
- Trying to use 'in' with list and string directly
- Ignoring need for loop or any()
- Assuming list membership works on strings
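The failure and the fix can be sketched side by side. This reuses the variables from the question; the `raised` flag and `result` variable are added here only for illustration:

```python
bad_words = ['bad', 'ugly']
text = 'This is a bad example.'

raised = False
try:
    bad_words in text  # a list on the left of 'in' with a str on the right
except TypeError as exc:
    raised = True
    print(f"TypeError: {exc}")

# Corrected check: test each word individually.
result = 'Filtered' if any(word in text for word in bad_words) else 'Clean'
print(result)  # Filtered
```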
banned = ['bad', 'ugly'] and string msg = 'This is a bad and ugly day.'?
Solution
Step 1: Understand string replacement for multiple words
We must replace each banned word one by one using a loop and str.replace().
Step 2: Evaluate each option
The correct option loops over the list and replaces each word:
for word in banned:
    msg = msg.replace(word, '[CENSORED]')
print(msg)
The other options try to replace the list directly (invalid), use wrong conditional syntax, or apply filter to the string, which does not replace words.
Final Answer:
for word in banned: msg = msg.replace(word, '[CENSORED]') print(msg) -> Option B
Quick Check:
Loop and replace each banned word [OK]
- Trying to replace list directly in string
- Using filter on string instead of list
- Incorrect conditional replacement syntax
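A runnable sketch of the loop-and-replace approach, followed by a whole-word variant. The regex variant is not part of the quiz; it is shown as one possible refinement, since plain str.replace() also rewrites banned words that appear inside longer words:

```python
import re

banned = ['bad', 'ugly']
msg = 'This is a bad and ugly day.'

# Loop and replace each banned word, as in the quiz answer.
censored = msg
for word in banned:
    censored = censored.replace(word, '[CENSORED]')
print(censored)  # This is a [CENSORED] and [CENSORED] day.

# Possible refinement (an assumption, not from the quiz): match whole words only,
# ignoring case, so 'badlands' is left alone.
pattern = re.compile(r'\b(' + '|'.join(map(re.escape, banned)) + r')\b',
                     re.IGNORECASE)
whole_word = pattern.sub('[CENSORED]', 'A BAD day in the badlands.')
print(whole_word)  # A [CENSORED] day in the badlands.
```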
