Recall & Review
beginner
What is a similarity-based retrieval strategy?
A similarity-based retrieval strategy finds the items most similar to a query by comparing their feature representations, typically vectors. It usually ranks results with a metric such as cosine similarity or Euclidean distance.
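As a rough illustration of the idea, the sketch below ranks a few items by cosine similarity to a query. The item names, the 2-D vectors, and the `top_k` helper are all made up for this example; a real system would use embeddings from a model and usually a vector index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_a * norm_b)

def top_k(query, items, k=2):
    """Return the names of the k items most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:k]]

# Hypothetical feature vectors, for illustration only.
items = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.9, 0.1],
    "doc_c": [0.0, 1.0],
}
query = [1.0, 0.05]
ranked = top_k(query, items, k=2)
print(ranked)
```

Nothing here considers how similar the returned items are to each other, which is the gap that diversity-aware strategies try to close.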
intermediate
Explain Maximal Marginal Relevance (MMR) in retrieval.
MMR balances relevance and diversity by selecting items that are both similar to the query and different from each other. This helps avoid redundant results and provides a varied set of answers.
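A minimal sketch of greedy MMR selection follows, assuming cosine similarity for both the relevance and the redundancy term. The function name `mmr_select`, the `lam` trade-off parameter, and the toy vectors are assumptions for this example, not part of any particular library.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mmr_select(query, docs, k, lam=0.5):
    """Greedily pick k document names, trading relevance against redundancy.

    lam=1.0 ranks by relevance only; smaller values penalize similarity
    to already selected documents more heavily.
    """
    selected = []
    candidates = dict(docs)
    while candidates and len(selected) < k:
        def score(name):
            relevance = cosine(query, candidates[name])
            # Highest similarity to anything already chosen; 0.0 when nothing is chosen yet.
            redundancy = max((cosine(candidates[name], docs[s]) for s in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        del candidates[best]
    return selected

# Hypothetical vectors: doc_b is a near-duplicate of doc_a, doc_c points elsewhere.
docs = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.99, 0.05],
    "doc_c": [0.6, 0.8],
}
query = [1.0, 0.1]
relevance_only = mmr_select(query, docs, k=2, lam=1.0)
balanced = mmr_select(query, docs, k=2, lam=0.5)
print(relevance_only, balanced)
```

With `lam=1.0` the two near-duplicates fill both slots; with `lam=0.5` the second slot goes to the less similar `doc_c`, which is the redundancy-avoiding behavior described above.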
intermediate
What does a hybrid retrieval strategy combine?
A hybrid retrieval strategy combines similarity-based methods with other techniques like MMR or rule-based filters to improve both relevance and diversity of results.
beginner
Why is diversity important in retrieval results?
Diversity ensures that retrieved results cover different aspects or perspectives, reducing repetition and increasing the chance of finding useful information.
beginner
Name two common similarity metrics used in retrieval strategies.
Cosine similarity and Euclidean distance are two common metrics used to measure how close or similar items are in retrieval tasks.
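The two metrics can disagree, because cosine similarity looks only at direction while Euclidean distance also reflects magnitude. The small sketch below uses made-up vectors to show this; the helper names are chosen for the example.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [1.0, 2.0]
b = [2.0, 4.0]  # same direction as a, twice the length
c = [2.0, 1.0]  # same length as a, different direction

cos_ab = cosine_similarity(a, b)
cos_ac = cosine_similarity(a, c)
dist_ab = euclidean_distance(a, b)
dist_ac = euclidean_distance(a, c)

# Higher cosine means more similar; lower distance means closer.
print(f"cosine:    a~b={cos_ab:.3f}  a~c={cos_ac:.3f}")
print(f"euclidean: a~b={dist_ab:.3f}  a~c={dist_ac:.3f}")
```

By cosine similarity `b` is the better match for `a`, while by Euclidean distance `c` is closer, so the choice of metric can change the ranking.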
Which retrieval strategy explicitly balances relevance and diversity?
A. Pure similarity-based retrieval
B. Maximal Marginal Relevance (MMR)
C. Random selection
D. Rule-based filtering
MMR selects items that are relevant to the query but also different from each other to ensure diversity.
What does a similarity-based retrieval strategy primarily use to rank results?
A. Predefined categories
B. Random chance
C. User feedback only
D. Similarity metrics like cosine similarity
Similarity-based retrieval ranks items by how close they are to the query using similarity metrics.
A hybrid retrieval strategy might combine similarity with which other approach?
A. Only random selection
B. Ignoring diversity
C. Maximal Marginal Relevance (MMR)
D. No filtering
Hybrid strategies combine similarity with methods like MMR to improve result quality.
Why avoid relying on similarity-based retrieval alone, with no diversity step?
A. Results may be repetitive and less useful
B. It is faster
C. It always gives the best results
D. It ignores relevance
Without diversity, results can be very similar and redundant, reducing usefulness.
Which metric measures the angle between two vectors in retrieval?
A. Cosine similarity
B. Euclidean distance
C. Manhattan distance
D. Jaccard index
Cosine similarity is the cosine of the angle between two vectors, so it reflects how closely their directions align regardless of their magnitudes.
Describe how Maximal Marginal Relevance (MMR) improves retrieval results compared to pure similarity-based methods.
Think about why showing only very similar items might not be helpful.
Explain what a hybrid retrieval strategy is and why it might be better than using only one retrieval method.
Consider mixing strengths of different approaches.
Practice
1. Which retrieval strategy focuses on ranking results purely based on how close they are to the query?
easy
A. Random retrieval
B. Maximal Marginal Relevance (MMR)
C. Similarity-based retrieval
D. Hybrid retrieval
Solution
Step 1: Understand similarity-based retrieval
Similarity-based retrieval ranks results by how close or similar they are to the query, focusing only on relevance.
Step 2: Compare with other strategies
MMR balances relevance and diversity, hybrid combines methods, and random is unrelated.
Final Answer:
Similarity-based retrieval -> Option C
Quick Check:
Similarity = closeness only [OK]
Hint: Similarity means closest match only [OK]
Common Mistakes:
Confusing MMR with similarity
Thinking hybrid is only similarity
Choosing random as a valid strategy
2. Which of the following is the correct way to describe Maximal Marginal Relevance (MMR)?
easy
A. Combines all retrieval methods without weighting
B. Ranks results by random selection
C. Only uses keyword matching
D. Balances relevance and diversity in retrieval
Solution
Step 1: Define MMR
MMR is designed to balance relevance to the query and diversity among the results to avoid redundancy.
Step 2: Eliminate incorrect options
Random selection is unrelated, keyword matching is too narrow, and combining without weighting is not MMR.
Final Answer:
Balances relevance and diversity in retrieval -> Option D
Quick Check:
MMR = relevance + diversity [OK]
Hint: MMR mixes relevance with diversity [OK]
Common Mistakes:
Thinking MMR is random
Assuming MMR uses only keywords
Believing MMR combines methods blindly
3. Given the following pseudo-code for a hybrid retrieval method combining similarity and MMR scores:
results = []
for doc in documents:
    sim_score = similarity(query, doc)
    mmr_score = mmr(query, doc, results)
    combined_score = 0.6 * sim_score + 0.4 * mmr_score
    results.append((doc, combined_score))
results.sort(key=lambda x: x[1], reverse=True)
print([doc for doc, score in results[:3]])
What does this code output?
medium
A. Top 3 documents ranked by combined similarity and MMR scores
B. Top 3 documents ranked by similarity score only
C. Top 3 documents ranked by MMR score only
D. Random 3 documents from the list
Solution
Step 1: Analyze score calculation
The code calculates a combined score using 60% similarity and 40% MMR for each document.
Step 2: Understand sorting and output
Documents are sorted by this combined score in descending order, then top 3 are printed.
Final Answer:
Top 3 documents ranked by combined similarity and MMR scores -> Option A
Quick Check:
Hybrid = combined scores [OK]
Hint: Check weighted sum and sorting for final ranking [OK]
Common Mistakes:
Ignoring MMR score in combined score
Assuming sorting by similarity only
Thinking output is random
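The pseudo-code in question 3 leaves `similarity` and `mmr` undefined. One way to make it runnable is sketched below, under loud assumptions: `similarity` is taken to be cosine similarity, `mmr` is a hypothetical relevance-minus-redundancy score against documents already scored, and an extra `vectors` argument carries the made-up document vectors. None of this is prescribed by the original snippet.

```python
import math

def similarity(a, b):
    """Cosine similarity; stands in for the undefined similarity() helper."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mmr(query, doc, results, vectors):
    """Hypothetical MMR-style score: relevance minus a penalty for being
    similar to documents that were scored earlier in the loop."""
    redundancy = max(
        (similarity(vectors[doc], vectors[seen]) for seen, _ in results),
        default=0.0,  # nothing scored yet, so no penalty
    )
    return similarity(query, vectors[doc]) - 0.5 * redundancy

def hybrid_rank(query, vectors, w_sim=0.6, w_mmr=0.4, top_n=3):
    """Score every document with a weighted sum, then return the top_n names."""
    results = []
    for doc in vectors:
        sim_score = similarity(query, vectors[doc])
        mmr_score = mmr(query, doc, results, vectors)
        results.append((doc, w_sim * sim_score + w_mmr * mmr_score))
    results.sort(key=lambda x: x[1], reverse=True)
    return [doc for doc, _ in results[:top_n]]

# Made-up vectors for illustration.
vectors = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.99, 0.05],
    "doc_c": [0.6, 0.8],
    "doc_d": [0.0, 1.0],
}
query = [1.0, 0.1]
top3 = hybrid_rank(query, vectors)
similarity_only = hybrid_rank(query, vectors, w_sim=1.0, w_mmr=0.0)
print(top3, similarity_only)
```

Because the penalty in this single-pass form depends on which documents were scored earlier, the ranking can change with iteration order. Greedy MMR avoids that by re-scoring the remaining candidates after each pick.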
4. Consider this buggy code snippet for MMR retrieval:
def mmr(query, docs, selected):
    scores = []
    for doc in docs:
        relevance = similarity(query, doc)
        diversity = min([similarity(doc, s) for s in selected])
        score = relevance - 0.5 * diversity
        scores.append((doc, score))
    return max(scores, key=lambda x: x[1])[0]
What is the main error causing a crash when selected is empty?
medium
A. Using min() on an empty list causes an error
B. Incorrect use of max() function
C. Missing return statement
D. Similarity function is undefined
Solution
Step 1: Identify cause of crash
When selected is empty, the list inside min() is empty, causing a ValueError.
Step 2: Understand min() behavior
Without a default argument, min() raises ValueError on an empty iterable, so the code crashes at that line.
Final Answer:
Using min() on an empty list causes an error -> Option A
Quick Check:
min(empty list) = error [OK]
Hint: Check min() on empty lists for errors [OK]
Common Mistakes:
Blaming max() instead of min()
Ignoring empty list edge case
Assuming similarity is undefined
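One possible fix for the snippet in question 4 is sketched below. It passes `default=0.0` so the empty case no longer raises. It also makes two changes beyond the crash fix, both assumptions of this sketch: the penalty term is renamed `redundancy`, and it uses `max()` rather than `min()`, since MMR as usually formulated penalizes a candidate by its highest similarity to any already selected item. The `similarity` helper is assumed to be cosine similarity, and the `penalty` parameter is added for the example.

```python
import math

def similarity(a, b):
    """Cosine similarity; an assumed stand-in for the undefined helper."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mmr(query, docs, selected, penalty=0.5):
    """Return the candidate with the best relevance-minus-redundancy score."""
    scores = []
    for doc in docs:
        relevance = similarity(query, doc)
        # default=0.0 handles an empty `selected` instead of raising ValueError.
        redundancy = max((similarity(doc, s) for s in selected), default=0.0)
        scores.append((doc, relevance - penalty * redundancy))
    return max(scores, key=lambda x: x[1])[0]

# Made-up document vectors, stored as tuples so they can be compared directly.
docs = [(1.0, 0.0), (0.99, 0.05), (0.6, 0.8)]
query = (1.0, 0.1)

first = mmr(query, docs, selected=[])  # no crash on the first pick
remaining = [d for d in docs if d != first]
second = mmr(query, remaining, selected=[first])
print(first, second)
```

An empty `docs` list would still raise at the final `max()` call, so a caller should check for remaining candidates before calling `mmr`.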
5. You want to improve a search system by combining similarity and MMR retrieval. Which approach best balances relevance and diversity in the final results?
hard
A. Use MMR with a diversity weight of zero
B. Combine similarity and MMR scores with adjustable weights
C. Use only similarity scores to rank results
D. Randomly shuffle results after similarity ranking
Solution
Step 1: Understand the goal
Balancing relevance and diversity requires combining both similarity and MMR scores meaningfully.
Step 2: Evaluate options
Using only similarity or zero diversity weight ignores diversity; random shuffling loses relevance order.
Step 3: Best approach
Combining similarity and MMR with adjustable weights allows tuning the balance effectively.
Final Answer:
Combine similarity and MMR scores with adjustable weights -> Option B
Quick Check:
Hybrid weighted combination = best balance [OK]
Hint: Adjust weights to balance relevance and diversity [OK]