
Hybrid search (semantic + keyword) in Prompt Engineering / GenAI - Model Metrics & Evaluation

Metrics & Evaluation - Hybrid search (semantic + keyword)
Which metrics matter for hybrid search, and why

Hybrid search combines semantic understanding with keyword matching to find the best results. The key metrics are recall and precision. Recall is the fraction of all relevant results that the search actually returns; it matters because a low value means good answers are being missed. Precision is the fraction of returned results that are actually relevant; it matters because a low value means users must wade through noise. Because hybrid search balances meaning against exact wording, the two metrics together show whether it finds enough good matches without returning too many wrong ones.

Confusion Matrix for Hybrid Search Results
      |----------------------------------------------|
      |              | Predicted                     |
      | Actual       | Relevant | Not relevant       |
      |--------------|----------|--------------------|
      | Relevant     |    TP    |         FN         |
      | Not relevant |    FP    |         TN         |
      |----------------------------------------------|

      TP = Relevant results the search returned
      FP = Irrelevant results the search returned
      FN = Relevant results the search missed
      TN = Irrelevant results the search correctly left out

Metrics use these counts:
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
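
As a minimal sketch of the two formulas above, the counts can be derived from a set of returned document IDs and a set of known-relevant IDs. The function name `precision_recall` and the document IDs are hypothetical, chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall from sets of document IDs.

    retrieved: IDs the search returned (hypothetical example data).
    relevant:  IDs judged relevant for the query (hypothetical labels).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)   # relevant and returned
    fp = len(retrieved - relevant)   # returned but not relevant
    fn = len(relevant - retrieved)   # relevant but missed
    # Guard against empty denominators (no results, or no relevant docs).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy query: 4 results returned, 5 documents are actually relevant.
p, r = precision_recall({"d1", "d2", "d3", "d7"}, {"d1", "d2", "d3", "d4", "d5"})
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.75 recall=0.60
```

TN does not appear in either formula, which is why both metrics stay informative even when almost every document in the collection is irrelevant.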

Precision vs Recall Tradeoff with Examples

In hybrid search, tuning toward semantic matching can increase recall by finding relevant results even when their wording differs from the query. But this may lower precision by letting in looser matches. Tuning toward strict keyword matching can increase precision by returning exact hits, but it can lower recall by missing related results.

Example 1: A legal document search needs high precision to avoid irrelevant cases, so keyword matching is emphasized.

Example 2: A customer support search needs high recall to find all helpful answers, so semantic search is emphasized.
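
One common way to express this emphasis is a single blending weight. The sketch below assumes both scores are already in the range 0 to 1 and uses a hypothetical weight `alpha` (not something the text above defines): values near 0 favor keyword matching, values near 1 favor semantic matching. The document names and scores are made up for illustration.

```python
# Toy scores in [0, 1]; document names and values are made up for illustration.
DOCS = {
    "exact_match": {"semantic": 0.55, "keyword": 0.95},
    "paraphrase": {"semantic": 0.90, "keyword": 0.10},
    "keyword_noise": {"semantic": 0.20, "keyword": 0.80},
}


def rank(alpha):
    """Rank documents by alpha * semantic + (1 - alpha) * keyword."""
    blended = {
        name: alpha * s["semantic"] + (1 - alpha) * s["keyword"]
        for name, s in DOCS.items()
    }
    return sorted(blended, key=blended.get, reverse=True)


print(rank(0.2))  # keyword-heavy: ['exact_match', 'keyword_noise', 'paraphrase']
print(rank(0.8))  # semantic-heavy: ['paraphrase', 'exact_match', 'keyword_noise']
```

With the keyword-heavy setting, the paraphrased document drops to last place (a recall risk). With the semantic-heavy setting, it rises to first place ahead of the exact match (a precision risk if the paraphrase is only loosely related).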

What Good vs Bad Metric Values Look Like

Good: Precision and recall both above 0.8 mean the search finds most relevant results while returning few irrelevant ones.

Bad: Precision below 0.5 means more than half of the returned results are irrelevant, which confuses users. Recall below 0.5 means more than half of the relevant results are missed.

As a rule of thumb, balanced values around 0.7 are often acceptable, depending on the use case.

Common Metrics Pitfalls in Hybrid Search
  • Accuracy paradox: For most queries, nearly all documents are irrelevant, so a system that returns very few results can score high accuracy while missing most relevant ones.
  • Data leakage: Evaluating on test queries that also appear in the training or tuning data inflates metrics.
  • Overfitting: Tuning too heavily toward keyword matching may miss semantic matches, hurting recall.
  • Ignoring user intent: Metrics alone don't capture whether results satisfy user needs.
Self Check: Your model has 98% accuracy but 12% recall on relevant results. Is it good?

No, it is not good. When most documents are irrelevant to a query, a system can reach high accuracy simply by returning very few results, because every irrelevant document it leaves out counts as correct. The 12% recall shows it misses 88% of the relevant results, so users won't find what they need. Improving recall is critical for the search to be useful.
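
To make the arithmetic concrete, here is one set of counts that produces exactly these numbers. The counts are hypothetical (a 10,000-document collection with 100 relevant documents); many other combinations would give the same percentages.

```python
# Hypothetical counts for a 10,000-document collection with 100 relevant documents.
tp, fn = 12, 88        # only 12 of the 100 relevant documents are returned
fp, tn = 112, 9788     # most irrelevant documents are correctly left out

total = tp + fp + fn + tn
accuracy = (tp + tn) / total
recall = tp / (tp + fn)
precision = tp / (tp + fp)

print(f"accuracy={accuracy:.2f} recall={recall:.2f} precision={precision:.2f}")
# accuracy=0.98 recall=0.12 precision=0.10
```

Almost all of the accuracy comes from the 9,788 true negatives, which says nothing about whether users find what they are looking for.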

Key Result
Hybrid search needs balanced precision and recall to find relevant results without too much noise.

Practice

1. What is the main advantage of hybrid search combining semantic and keyword methods?
easy
A. It improves search relevance by using both exact words and meaning.
B. It only uses exact keyword matching for faster results.
C. It ignores word meanings to focus on keyword frequency.
D. It replaces keywords with random words for variety.

Solution

  1. Step 1: Understand keyword and semantic search roles

    Keyword search finds exact word matches; semantic search finds meaning matches.
  2. Step 2: Combine both for better results

    Hybrid search uses both to improve relevance and user satisfaction.
  3. Final Answer:

    It improves search relevance by using both exact words and meaning. -> Option A
  4. Quick Check:

    Hybrid search = better relevance [OK]
Hint: Hybrid = exact words + meaning for best results [OK]
Common Mistakes:
  • Thinking hybrid search uses only keywords
  • Assuming semantic search ignores keywords
  • Believing hybrid search always slows down search
2. Which of the following is the correct way to combine semantic and keyword scores in hybrid search?
easy
A. final_score = semantic_score * keyword_score
B. final_score = semantic_score / keyword_score
C. final_score = semantic_score - keyword_score
D. final_score = semantic_score + keyword_score

Solution

  1. Step 1: Understand score combination methods

    Adding the scores lets both the semantic and keyword signals contribute, provided the two scores are on comparable scales.
  2. Step 2: Choose addition for hybrid scoring

    Adding semantic and keyword scores is common to combine relevance signals.
  3. Final Answer:

    final_score = semantic_score + keyword_score -> Option D
  4. Quick Check:

    Hybrid score = sum of semantic and keyword [OK]
Hint: Add scores to combine semantic and keyword relevance [OK]
Common Mistakes:
  • Multiplying scores, which can produce very small or very large values
  • Subtracting scores, which discards positive relevance signal
  • Dividing scores, which fails when the denominator is zero
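
Plain addition works best when both scores share a scale. As a hedged sketch, assuming the keyword scorer returns unbounded raw values while the semantic scorer returns values between 0 and 1, the keyword scores can be min-max rescaled before adding. The helper `min_max` and the raw values below are hypothetical, and min-max is only one of several possible normalizations.

```python
def min_max(scores):
    """Rescale scores to [0, 1]; returns all zeros if every score is equal."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


semantic_scores = [0.8, 0.5, 0.3]      # e.g. cosine similarities (toy values)
keyword_raw = [12.0, 3.0, 7.5]         # e.g. unbounded keyword scores (toy values)

keyword_scores = min_max(keyword_raw)  # [1.0, 0.0, 0.5]
final_scores = [s + k for s, k in zip(semantic_scores, keyword_scores)]
print(final_scores)
```

Without the rescaling step, the raw keyword values here would dominate the sum and the semantic score would barely affect the ranking.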
3. Given the code snippet:
semantic_scores = [0.8, 0.5, 0.3]
keyword_scores = [0.6, 0.7, 0.4]
final_scores = [s + k for s, k in zip(semantic_scores, keyword_scores)]
print(final_scores)

What is the output?
medium
A. [1.4, 1.2, 0.7]
B. [0.2, -0.2, -0.1]
C. [0.48, 0.35, 0.12]
D. [1.2, 1.4, 0.7]

Solution

  1. Step 1: Add corresponding semantic and keyword scores

    0.8+0.6=1.4, 0.5+0.7=1.2, 0.3+0.4=0.7
  2. Step 2: Create list of summed scores

    final_scores = [1.4, 1.2, 0.7]
  3. Final Answer:

    [1.4, 1.2, 0.7] -> Option A
  4. Quick Check:

    Sum pairs = [1.4, 1.2, 0.7] [OK]
Hint: Add pairs element-wise for final scores [OK]
Common Mistakes:
  • Multiplying instead of adding scores
  • Mixing order of scores in zip
  • Confusing subtraction with addition
4. Identify the error in this hybrid search scoring code:
semantic_scores = [0.9, 0.4, 0.7]
keyword_scores = [0.5, 0.6]
final_scores = [s + k for s, k in zip(semantic_scores, keyword_scores)]
print(final_scores)
medium
A. Adding scores should use multiplication instead.
B. Using zip causes a syntax error here.
C. Lists have different lengths causing missing scores.
D. The print statement is missing parentheses.

Solution

  1. Step 1: Check list lengths

    semantic_scores has 3 items; keyword_scores has 2 items.
  2. Step 2: Understand zip behavior

    zip stops at the shortest list, so the last semantic score (0.7) is silently dropped.
  3. Final Answer:

    Lists have different lengths causing missing scores. -> Option C
  4. Quick Check:

    Unequal list lengths truncate results [OK]
Hint: Ensure lists are same length before zipping [OK]
Common Mistakes:
  • Assuming zip pads the shorter list automatically
  • Thinking zip causes syntax error
  • Believing multiplication is required for hybrid scores
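
One way to catch this mistake early is to check the lengths before zipping. The helper name `combine_scores` below is hypothetical; it is a small sketch, not part of any library.

```python
def combine_scores(semantic_scores, keyword_scores):
    """Add scores pairwise, refusing to silently drop unmatched entries."""
    if len(semantic_scores) != len(keyword_scores):
        raise ValueError(
            f"length mismatch: {len(semantic_scores)} semantic vs "
            f"{len(keyword_scores)} keyword scores"
        )
    return [s + k for s, k in zip(semantic_scores, keyword_scores)]


try:
    combine_scores([0.9, 0.4, 0.7], [0.5, 0.6])
except ValueError as err:
    print(err)  # length mismatch: 3 semantic vs 2 keyword scores
```

On Python 3.10 and later, `zip(semantic_scores, keyword_scores, strict=True)` raises a `ValueError` on unequal lengths without a manual check.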
5. You want to improve a hybrid search system by weighting semantic similarity twice as much as keyword matching. Which formula correctly applies this?
hard
A. final_score = semantic_score + 2 * keyword_score
B. final_score = 2 * semantic_score + keyword_score
C. final_score = semantic_score * keyword_score * 2
D. final_score = (semantic_score + keyword_score) / 2

Solution

  1. Step 1: Identify weighting requirement

    Semantic similarity should count double compared to keyword score.
  2. Step 2: Apply weights in formula

    Multiply semantic_score by 2, then add keyword_score.
  3. Final Answer:

    final_score = 2 * semantic_score + keyword_score -> Option B
  4. Quick Check:

    Semantic weighted double = 2 * semantic + keyword [OK]
Hint: Multiply semantic score by 2 before adding keyword [OK]
Common Mistakes:
  • Weighting keyword score instead of semantic
  • Multiplying all scores together
  • Dividing sum instead of weighting
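
The weighted formula from this question can be written as a small function. The name `weighted_score` and its parameter names are hypothetical; the defaults reproduce Option B, and the score lists reuse the values from question 3.

```python
def weighted_score(semantic_score, keyword_score,
                   semantic_weight=2.0, keyword_weight=1.0):
    """Linear combination; defaults weight semantic similarity twice as much."""
    return semantic_weight * semantic_score + keyword_weight * keyword_score


semantic_scores = [0.8, 0.5, 0.3]
keyword_scores = [0.6, 0.7, 0.4]
final_scores = [weighted_score(s, k) for s, k in zip(semantic_scores, keyword_scores)]
print(final_scores)
```

Dividing both weights by their sum (giving 2/3 and 1/3) leaves the ranking unchanged while keeping the final score in the same range as the inputs, which can make scores easier to compare across configurations.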