Recall & Review

beginner

What is similarity search in machine learning?

Similarity search is a method for finding the items most similar to a given item, based on some measure of closeness or resemblance.

beginner

Name a common way to measure similarity between two data points.

Cosine similarity is a common measure. It computes the cosine of the angle between two vectors, so vectors pointing in the same direction score 1 and perpendicular vectors score 0.
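As a rough illustration of the measure described above, here is a minimal sketch using NumPy. The function name cosine_similarity is chosen for this example, and the sketch assumes neither vector is all zeros (the measure is undefined in that case).

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (assumes both are non-zero)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

same_direction = cosine_similarity([1, 2], [2, 4])   # close to 1
perpendicular = cosine_similarity([1, 0], [0, 1])    # 0
opposite = cosine_similarity([1, 0], [-1, 0])        # -1
```

Because the result depends only on direction, scaling either vector by a positive number leaves the score unchanged.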

beginner

Why is vector representation important in similarity search?

Vector representation converts data into numbers so computers can measure similarity using math, like distances or angles between vectors.
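To make this concrete, the sketch below turns short texts into count vectors over a fixed vocabulary (a toy bag-of-words scheme). The helper name bag_of_words and the vocabulary are assumptions made for this example; real systems typically use learned embeddings instead.

```python
from collections import Counter

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

vocabulary = ["cat", "dog", "sat", "mat"]
v1 = bag_of_words("The cat sat on the mat", vocabulary)  # [1, 0, 1, 1]
v2 = bag_of_words("The dog sat", vocabulary)             # [0, 1, 1, 0]
```

Once both texts are lists of numbers of the same length, any vector measure (distance or angle) can compare them.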

intermediate

What is the role of an index in similarity search and retrieval?

An index organizes data vectors so the system can quickly find the most similar items without checking every single one.
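One possible way to build such an index is to group vectors into buckets using random hyperplanes (a form of locality-sensitive hashing). The sketch below is an illustrative toy, not how any particular library works; the class name HyperplaneIndex and its parameters are assumptions made for this example.

```python
import numpy as np
from collections import defaultdict

class HyperplaneIndex:
    """Toy index: vectors on the same side of every random plane share a bucket."""

    def __init__(self, dim, n_planes=4, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_planes, dim))  # one random plane per row
        self.buckets = defaultdict(list)

    def _signature(self, vector):
        # One bit per plane: which side of the plane the vector falls on.
        return tuple(bool(side) for side in (self.planes @ vector) > 0)

    def add(self, item_id, vector):
        self.buckets[self._signature(vector)].append(item_id)

    def candidates(self, query):
        # Only the query's own bucket is inspected, not the whole collection.
        return list(self.buckets.get(self._signature(query), []))

index = HyperplaneIndex(dim=3)
index.add("a", np.array([1.0, 0.0, 0.0]))
index.add("b", np.array([-1.0, 0.0, 0.0]))
nearby = index.candidates(np.array([2.0, 0.0, 0.0]))  # contains "a", not "b"
```

The candidates would then be ranked with a similarity measure. Vectors that point in similar directions tend to share a bucket, but a near neighbor can land in a different bucket, which is why this kind of index gives approximate results.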

intermediate

Explain the difference between exact and approximate similarity search.

Exact search returns the true closest matches but can be slow on large datasets. Approximate search returns close matches much faster but may miss some of the very best ones.
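The trade-off can be sketched with a small example. Here exact search scores every vector, while the approximate variant scores only a candidate subset, standing in for whatever an index would return. The function names and the candidate list are assumptions made for this illustration.

```python
import numpy as np

def cosine_scores(vectors, query):
    """Cosine similarity between the query and every row of vectors."""
    return (vectors @ query) / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(query))

def exact_top_k(vectors, query, k):
    """Brute force: score everything, so the result is the true top k."""
    scores = cosine_scores(vectors, query)
    return [int(i) for i in np.argsort(-scores)[:k]]

def restricted_top_k(vectors, query, k, candidate_ids):
    """Approximate: score only the candidates, so true neighbors can be missed."""
    candidate_ids = np.asarray(candidate_ids)
    scores = cosine_scores(vectors[candidate_ids], query)
    return [int(candidate_ids[i]) for i in np.argsort(-scores)[:k]]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top k that the approximate search recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

vectors = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.7, 0.7]])
query = np.array([1.0, 0.0])

exact = exact_top_k(vectors, query, k=2)                                  # [0, 1]
approx = restricted_top_k(vectors, query, k=2, candidate_ids=[1, 2, 3])   # [1, 3]
recall = recall_at_k(approx, exact)                                       # 0.5
```

The approximate search does less work because it skips vector 0, and it pays for that by missing the best match. Recall against the exact result is a common way to quantify that loss.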

Which similarity measure calculates the angle between two vectors?

A. Manhattan distance

B. Euclidean distance

C. Jaccard index

D. Cosine similarity

What is the main purpose of an index in similarity search?

A. To train machine learning models

B. To store raw data

C. To speed up finding similar items

D. To visualize data

Which of these is a drawback of exact similarity search?

A. It cannot handle vectors

B. It can be slow on large datasets

C. It uses approximate results

D. It is inaccurate

Vector representation is important because:

A. It allows mathematical comparison of data

B. It stores data as text

C. It removes the need for similarity measures

D. It visualizes data

Which similarity measure is best for comparing sets of items?

A. Jaccard index

B. Cosine similarity

C. Euclidean distance

D. Pearson correlation

Describe how similarity search works and why it is useful in real life.

Explain the difference between exact and approximate similarity search and when you might use each.

Practice

What is the main goal of similarity search in machine learning?

easy

A. To count the number of items in a dataset

B. To sort items alphabetically

C. To find items that are close or alike in a collection

D. To remove duplicate items from a list

You have a collection of text documents converted into vectors, and you want to find the 2 documents most similar to a new query vector using cosine similarity. The approach is:

1. Compute the cosine similarity between the query and each document vector.
2. Sort the documents by similarity score in descending order.
3. Return the top 2 documents.

Which code snippet correctly implements this?

import numpy as np

docs = [np.array([1, 0]), np.array([0, 1]), np.array([1, 1])]
query = np.array([1, 0])

# Choose the correct code:

hard

A.
    scores = [np.dot(query, d) * np.linalg.norm(query) * np.linalg.norm(d) for d in docs]
    top2 = sorted(scores)[:2]
    print(top2)

B.
    scores = [np.dot(query, d) / (np.linalg.norm(query) * np.linalg.norm(d)) for d in docs]
    top2 = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:2]
    print(top2)

C.
    scores = [np.dot(query, d) / (np.linalg.norm(query) - np.linalg.norm(d)) for d in docs]
    top2 = sorted(range(len(scores)), key=lambda i: scores[i])[:2]
    print(top2)

D.
    scores = [np.cross(query, d) / (np.linalg.norm(query) * np.linalg.norm(d)) for d in docs]
    top2 = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:2]
    print(top2)

Solution

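Step 1: Understand the purpose of similarity search

Similarity search finds the items in a collection that are most similar to a given item, according to some measure of closeness.

Step 2: Compare options with the definition

Counting items (A), sorting alphabetically (B), and removing duplicates (D) do not rank items by how close they are to a given item. Only option C matches the definition.

Final Answer:

C. To find items that are close or alike in a collection

Quick Check:

Similarity search means finding alike items, which is option C.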

Solution

Step 1: Recall cosine similarity formula

Step 2: Match formula to code options

Final Answer:

Quick Check:

Solution

Step 1: Calculate dot product of vec1 and vec2

Step 2: Calculate norms and cosine similarity

Final Answer:

Quick Check:

Solution

Step 1: Analyze the cosine similarity formula in code

Step 2: Identify missing parentheses

Final Answer:

Quick Check:

Solution

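Step 1: Compute cosine similarity correctly

Cosine similarity is the dot product divided by the product of the two norms. Option A multiplies by the norms instead of dividing. Option C subtracts the norms, which gives a zero denominator for the first document. Option D uses np.cross, which is not a dot product. Only option B computes np.dot(query, d) / (np.linalg.norm(query) * np.linalg.norm(d)).

Step 2: Sort indices by similarity descending and select top 2

For the given data the scores are 1.0, 0.0, and about 0.707. Sorting the indices by score with reverse=True and slicing [:2] gives [0, 2].

Final Answer:

B. It divides the dot product by the product of the norms and returns the indices of the two highest scores.

Quick Check:

Dot product divided by the product of norms, sorted in descending order: option B, which prints [0, 2].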