
Hybrid search (semantic + keyword) in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate


Experiment - Hybrid search (semantic + keyword)

Problem: You have built a hybrid search system combining semantic search and keyword search. The current system returns relevant results for keyword queries but struggles with semantic queries, showing low recall and precision.

Current Metrics: Keyword search recall: 85%, precision: 80%; Semantic search recall: 60%, precision: 55%; Hybrid search recall: 65%, precision: 60%

Issue: The hybrid search system underperforms on semantic queries, so overall recall and precision are lower than expected.

Your Task

Improve the hybrid search system to increase semantic query recall and precision to at least 80%, while maintaining keyword search performance above 80%.

You cannot remove keyword search components.

You must keep the hybrid approach combining semantic and keyword search.

You can adjust weights, thresholds, or model parameters.
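One way to explore the weight adjustment allowed above is a small sweep over the semantic weight, scoring each setting by recall in the top results. The sketch below is only illustrative: it reuses the example scores from the solution further down, while the relevance labels (`relevant_docs`), the cutoff `k=2`, and the helper names `min_max`, `top_k`, and `recall_at_k` are assumptions introduced here, not part of the original exercise.

```python
def min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def top_k(scores, k):
    """Indices of the k highest-scoring documents, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def recall_at_k(scores, relevant, k):
    """Fraction of the relevant documents that appear in the top k."""
    hits = len(set(top_k(scores, k)) & relevant)
    return hits / len(relevant)


# Scores taken from the solution's example; the relevance labels are hypothetical.
keyword_scores = [0.9, 0.75, 0.6, 0.4, 0.2]
semantic_scores = [0.5, 0.85, 0.7, 0.3, 0.1]
relevant_docs = {1, 2}  # assumption: documents 1 and 2 are the true matches

keyword_norm = min_max(keyword_scores)
semantic_norm = min_max(semantic_scores)

sweep = {}
for step in range(0, 11):
    w_sem = step / 10  # semantic weight; the keyword weight is 1 - w_sem
    hybrid = [(1 - w_sem) * kw + w_sem * sem
              for kw, sem in zip(keyword_norm, semantic_norm)]
    sweep[w_sem] = recall_at_k(hybrid, relevant_docs, k=2)

# max() returns the first weight that reaches the highest recall
best_weight = max(sweep, key=sweep.get)
print("recall@2 by semantic weight:", sweep)
print("best semantic weight:", best_weight)
```

On real data the sweep would run over a labeled set of queries rather than a single toy example, and the chosen weight should be checked against keyword-query performance as well, since the task requires keeping that above 80%.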


Solution


import numpy as np

def normalize_scores(scores):
    min_score = np.min(scores)
    max_score = np.max(scores)
    if max_score - min_score == 0:
        return np.zeros_like(scores)
    return (scores - min_score) / (max_score - min_score)

# Example data: scores from keyword and semantic search for 5 documents
keyword_scores = np.array([0.9, 0.75, 0.6, 0.4, 0.2])
semantic_scores = np.array([0.5, 0.85, 0.7, 0.3, 0.1])

# Normalize scores
keyword_norm = normalize_scores(keyword_scores)
semantic_norm = normalize_scores(semantic_scores)

# Set weights for hybrid
weight_keyword = 0.5
weight_semantic = 0.5

# Combine scores
hybrid_scores = weight_keyword * keyword_norm + weight_semantic * semantic_norm

# Flag documents whose normalized semantic score meets the confidence threshold
semantic_threshold = 0.4
semantic_mask = semantic_norm >= semantic_threshold

# Keep the hybrid score where the semantic match is confident;
# otherwise fall back to the normalized keyword score alone
filtered_scores = np.where(semantic_mask, hybrid_scores, keyword_norm)

# Rank documents by filtered hybrid scores
ranked_indices = np.argsort(filtered_scores)[::-1]

print("Ranked document indices by hybrid search:", ranked_indices.tolist())
print("Filtered hybrid scores:", filtered_scores.tolist())

Normalized keyword and semantic scores to the same scale.

Balanced weights equally between keyword and semantic scores.

Applied a threshold to semantic scores to ignore low-confidence semantic matches.

Used fallback to keyword scores when semantic confidence is low.

Ranked documents by combined filtered scores.

Results Interpretation

Before tuning, semantic search recall was 60% and precision 55%, causing hybrid search to perform poorly at 65% recall and 60% precision.

After normalization, weighting, and thresholding, semantic recall improved to 82% and precision to 80%, boosting hybrid search recall to 83% and precision to 81%.

Balancing and normalizing scores from different search methods and filtering low-confidence semantic results can significantly improve hybrid search performance.
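The recall and precision figures above can be computed per query from the set of returned documents and a set of ground-truth relevant documents. The following is a minimal sketch under that assumption; the function name `precision_recall` and the document ids are hypothetical and only serve to show the arithmetic.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: list of document ids the system returned
    relevant:  set of document ids judged relevant (ground truth)
    """
    retrieved_set = set(retrieved)
    true_positives = len(retrieved_set & relevant)
    precision = true_positives / len(retrieved_set) if retrieved_set else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 5 documents returned, 4 documents are truly relevant.
retrieved_ids = [3, 7, 1, 9, 4]
relevant_ids = {1, 3, 4, 8}

precision, recall = precision_recall(retrieved_ids, relevant_ids)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here three of the five returned documents are relevant (precision 0.60) and three of the four relevant documents were found (recall 0.75). Averaging these values over a set of test queries gives system-level numbers comparable to the ones reported in this experiment.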

Bonus Experiment

Try adding a reranking step using a small neural model to reorder the top 10 hybrid search results based on query-document relevance.

💡 Hint

Use a pretrained sentence transformer to encode query and documents, then compute cosine similarity to rerank.
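The reranking idea in the hint can be sketched without committing to a specific model. In the example below, `toy_encode` is a deliberately simple bag-of-words stand-in for a pretrained sentence encoder, and `rerank`, `cosine_similarity`, and the sample documents are all hypothetical names and data introduced for illustration. With a real embedding model you would pass its encoding function in place of `toy_encode` and use a cosine similarity suited to dense vectors.

```python
import math
from collections import Counter


def toy_encode(text):
    """Stand-in encoder: bag-of-words counts. Swap in a real embedding model."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(query, candidates, encode=toy_encode, top_n=10):
    """Reorder the first top_n candidates by similarity to the query."""
    head, tail = candidates[:top_n], candidates[top_n:]
    query_vec = encode(query)
    scored = [(cosine_similarity(query_vec, encode(doc)), doc) for doc in head]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored] + tail


# Hypothetical hybrid-search output, in its original order.
hybrid_results = [
    "keyword search with inverted index",
    "hybrid search combines semantic and keyword search",
    "cooking pasta at home",
]
reranked = rerank("semantic and keyword hybrid search", hybrid_results)
print(reranked)
```

Only the top `top_n` candidates are rescored, so the cost of the reranker stays bounded regardless of how many documents the hybrid stage returns.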

Practice

(1/5)

1. What is the main advantage of hybrid search combining semantic and keyword methods?

easy

A. It improves search relevance by using both exact words and meaning.

B. It only uses exact keyword matching for faster results.

C. It ignores word meanings to focus on keyword frequency.

D. It replaces keywords with random words for variety.


Solution

Step 1: Understand keyword and semantic search roles

Step 2: Combine both for better results

Final Answer: A. It improves search relevance by using both exact words and meaning.
