Hybrid search (semantic + keyword) in Prompt Engineering / GenAI - Model Pipeline Trace

Model Pipeline - Hybrid search (semantic + keyword)

This hybrid search pipeline combines keyword matching with semantic similarity to find the most relevant documents. It first narrows the collection to documents that contain query keywords, then ranks the remaining documents by how similar their embeddings, produced by a trained model, are to the query embedding.

Data Flow - 5 Stages

1. Input Query

1 query string → User inputs a search query → 1 query string

"best Italian restaurants near me"

↓

2. Keyword Filtering

N documents x text → Filter documents containing query keywords → M documents x text (M ≤ N)

From 1000 docs, filter to 150 docs containing words like 'Italian', 'restaurants'
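
The filtering stage can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual implementation: the `tokenize` and `keyword_filter` helpers, the stop-word list, and the sample documents are all assumptions made for the example. It keeps any document that shares at least one non-stop-word token with the query.

```python
import re

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "in", "of", "near", "me", "best"}

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_filter(query: str, docs: list[str]) -> list[str]:
    """Keep documents sharing at least one non-stop-word token with the query."""
    keywords = tokenize(query) - STOP_WORDS
    return [doc for doc in docs if keywords & tokenize(doc)]

# Made-up documents for illustration only.
docs = [
    "Top Italian restaurants in the city centre",
    "How to repair a bicycle tyre",
    "Authentic Italian pasta recipes",
]
filtered = keyword_filter("best Italian restaurants near me", docs)
```

Here `filtered` holds the first and third documents, so M = 2 from N = 3. Matching on any keyword keeps recall high; requiring all keywords would shrink M further at the risk of dropping relevant documents.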

↓

3. Semantic Embedding

1 query string and M documents x text → Convert query and documents to vector embeddings → 1 query vector (768 dims), M document vectors (768 dims each)

Query vector: [0.12, -0.05, ..., 0.33], Document vector: [0.10, -0.02, ..., 0.30]
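
To show the shape of the data at this stage, here is a toy stand-in for the encoder. It is an assumption made for illustration: `toy_embed` hashes words into 768 buckets and normalises the result, so it only captures word overlap, not meaning. A real pipeline would call a trained embedding model here instead.

```python
import hashlib
import math
import re

DIMS = 768  # matches the dimensionality quoted for this stage

def toy_embed(text: str, dims: int = DIMS) -> list[float]:
    """Hash each word into one of `dims` buckets, then L2-normalise the counts.

    Stand-in for a trained encoder: it reflects word overlap, not semantics.
    """
    vec = [0.0] * dims
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(word.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % dims] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

query_vec = toy_embed("best Italian restaurants near me")
doc_vecs = [
    toy_embed("Top Italian restaurants in the city centre"),
    toy_embed("Authentic Italian pasta recipes"),
]
```

The output has the same structure as in the stage description: one 768-dimensional query vector and M document vectors of the same size.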

↓

4. Similarity Scoring

1 query vector, M document vectors → Calculate cosine similarity between query and each document vector → M similarity scores (float between -1 and 1)

[0.85, 0.78, 0.65, ...]
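
Cosine similarity is the dot product of two vectors divided by the product of their lengths. A minimal sketch follows; the 3-dimensional vectors are made up for readability and are not the 768-dimensional vectors from the previous stage.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between a and b (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Illustrative low-dimensional vectors (assumed values).
query_vec = [0.12, -0.05, 0.33]
doc_vecs = [
    [0.10, -0.02, 0.30],    # points in nearly the same direction as the query
    [-0.12, 0.05, -0.33],   # exact opposite direction
]
scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
```

The first score is close to 1 and the second is -1 (up to floating-point error), which are the two ends of the range stated above.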

↓

5. Ranking and Output

M documents with similarity scores → Sort documents by similarity score descending → Top K documents ranked

Top 5 documents with scores: [(doc23, 0.85), (doc7, 0.83), ...]
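
Ranking is a sort on the score, truncated to K results. A small sketch, assuming the scores arrive as `(doc_id, score)` pairs; the `top_k` helper and the IDs `doc41` and `doc12` are hypothetical.

```python
import heapq

def top_k(scored_docs: list[tuple[str, float]], k: int) -> list[tuple[str, float]]:
    """Return the k highest-scoring (doc_id, score) pairs, best first."""
    return heapq.nlargest(k, scored_docs, key=lambda pair: pair[1])

# Scores in arbitrary (unsorted) order, as they come out of the scoring stage.
scored = [("doc23", 0.85), ("doc41", 0.65), ("doc7", 0.83), ("doc12", 0.78)]
ranked = top_k(scored, k=2)
```

`heapq.nlargest` avoids sorting all M documents when K is much smaller than M; for small M, `sorted(scored, key=..., reverse=True)[:k]` gives the same result.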

Training Trace - Epoch by Epoch


[Loss curve: loss (y-axis) against epochs 1 to 5 (x-axis); values are listed in the table below]

Epoch	Loss ↓	Accuracy ↑	Observation
1	0.65	0.60	Model starts learning semantic relations, initial moderate accuracy
2	0.48	0.72	Loss decreases, accuracy improves as embeddings better capture meaning
3	0.35	0.81	Model approaches convergence, semantic similarity scores become more reliable
4	0.30	0.85	Fine tuning improves ranking quality, loss stabilizes
5	0.28	0.87	Final epoch has the lowest loss and highest accuracy of the run
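
The flattening of the curve can be read off the table by computing the change between consecutive epochs. A short sketch using the table's values; the `deltas` helper is just for illustration.

```python
# Values copied from the epoch table above.
losses = [0.65, 0.48, 0.35, 0.30, 0.28]
accuracies = [0.60, 0.72, 0.81, 0.85, 0.87]

def deltas(values: list[float]) -> list[float]:
    """Change from each epoch to the next, rounded to two decimals."""
    return [round(cur - prev, 2) for prev, cur in zip(values, values[1:])]

loss_changes = deltas(losses)          # [-0.17, -0.13, -0.05, -0.02]
accuracy_changes = deltas(accuracies)  # [0.12, 0.09, 0.04, 0.02]
```

Each step is smaller than the one before it, which is what "loss stabilizes" refers to in the epoch 4 observation.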

Prediction Trace - 4 Layers

Layer 1: Keyword Filtering

Layer 2: Semantic Embedding

Layer 3: Similarity Scoring

Layer 4: Ranking

Model Quiz - 3 Questions

Test your understanding

What is the main purpose of the keyword filtering stage?

A. To reduce the number of documents before semantic comparison

B. To convert text into vectors

C. To calculate similarity scores

D. To rank documents by relevance

Key Insight

Combining keyword filtering with semantic similarity balances speed and understanding, enabling efficient and meaningful search results.

Practice

1. What is the main advantage of hybrid search combining semantic and keyword methods?

easy

A. It improves search relevance by using both exact words and meaning.

B. It only uses exact keyword matching for faster results.

C. It ignores word meanings to focus on keyword frequency.

D. It replaces keywords with random words for variety.

Solution

Step 1: Understand keyword and semantic search roles

Step 2: Combine both for better results

Final Answer:

Quick Check:

Solution

Step 1: Understand score combination methods

Step 2: Choose addition for hybrid scoring

Final Answer:

Quick Check:

Solution

Step 1: Add corresponding semantic and keyword scores

Step 2: Create list of summed scores

Final Answer:

Quick Check:
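
The two steps in this solution amount to an element-wise sum of two score lists. A minimal sketch with made-up scores, since the question's own data is not shown here:

```python
# Assumed example scores, one entry per document, in the same document order.
semantic_scores = [0.75, 0.5, 0.25]
keyword_scores = [0.125, 0.25, 0.5]

# Step 1 and Step 2: add corresponding scores and collect the sums in a list.
hybrid_scores = [s + k for s, k in zip(semantic_scores, keyword_scores)]
```

With these inputs `hybrid_scores` is `[0.875, 0.75, 0.75]`. A plain sum only makes sense when both score types are on comparable scales.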

Solution

Step 1: Check list lengths

Step 2: Understand zip behavior

Final Answer:

Quick Check:
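
The steps here point at a common pitfall: `zip` stops at the shorter input, so a missing score silently drops a document instead of raising an error. A sketch with assumed data; the `combine_strict` helper is hypothetical and shows one way to guard against the mismatch.

```python
semantic_scores = [0.75, 0.5, 0.25]
keyword_scores = [0.125, 0.25]  # one score is missing

# zip truncates to the shorter list, so the third document disappears.
truncated = [s + k for s, k in zip(semantic_scores, keyword_scores)]

def combine_strict(sem: list[float], kw: list[float]) -> list[float]:
    """Sum scores element-wise, refusing lists of different lengths."""
    if len(sem) != len(kw):
        raise ValueError(f"length mismatch: {len(sem)} vs {len(kw)}")
    return [s + k for s, k in zip(sem, kw)]
```

`truncated` has two entries, not three. On Python 3.10 and later, `zip(sem, kw, strict=True)` raises `ValueError` on a length mismatch without the manual check.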

Solution

Step 1: Identify weighting requirement

Step 2: Apply weights in formula

Final Answer:

Quick Check:
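
Weighting replaces the plain sum with a weighted one. The sketch below assumes a single weight `alpha` for the semantic score and `1 - alpha` for the keyword score; the function name, the default of 0.75, and the sample scores are illustrative choices, not values from the question.

```python
def weighted_hybrid(sem: list[float], kw: list[float], alpha: float = 0.75) -> list[float]:
    """Return alpha * semantic + (1 - alpha) * keyword for each document."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if len(sem) != len(kw):
        raise ValueError("score lists must have the same length")
    return [alpha * s + (1.0 - alpha) * k for s, k in zip(sem, kw)]

# Assumed example scores for two documents.
semantic_scores = [1.0, 0.5]
keyword_scores = [0.0, 1.0]
weighted = weighted_hybrid(semantic_scores, keyword_scores, alpha=0.75)
```

With `alpha=0.75` the result is `[0.75, 0.625]`. Setting `alpha=1.0` reduces to pure semantic ranking and `alpha=0.0` to pure keyword ranking.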