
Embedding generation in Prompt Engineering / GenAI - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank
easy

Complete the code to generate an embedding vector from text using a model.

embedding = model.[1](text)
A. fit
B. train
C. predict
D. encode
Common Mistakes
Using 'train' or 'fit' instead of 'encode' to get embeddings.
Using 'predict' which is for classification or regression outputs.
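As a rough illustration of what an `encode` call returns, the sketch below uses a hypothetical `ToyEmbedder` class (not part of any real library) that hashes words into a small fixed-length vector. Real embedding libraries, for example sentence-transformers, expose a method with the same name, but their vectors come from a trained model rather than from hashing.

```python
import hashlib
import numpy as np


class ToyEmbedder:
    """Hypothetical stand-in for an embedding model; not a real library class."""

    def __init__(self, dim=8):
        self.dim = dim

    def encode(self, text):
        # Hash each word into one of `dim` buckets and count occurrences.
        vector = np.zeros(self.dim)
        for word in text.lower().split():
            digest = hashlib.md5(word.encode("utf-8")).hexdigest()
            vector[int(digest, 16) % self.dim] += 1.0
        return vector


model = ToyEmbedder()
text = "embeddings turn text into vectors"
embedding = model.encode(text)  # no training step is involved in this call
print(embedding.shape)
```

The point is only the call pattern: text goes in, a fixed-length numeric vector comes out.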
2. Fill in the blank
medium

Complete the code to normalize the embedding vector to unit length.

normalized_embedding = embedding / [1](embedding)
A. sum
B. len
C. np.linalg.norm
D. max
Common Mistakes
Using sum or max which do not compute vector length.
Using len which returns number of elements, not magnitude.
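A small worked example may make the difference between these functions concrete. The vector below is an arbitrary placeholder chosen so that its Euclidean length is easy to check by hand.

```python
import numpy as np

embedding = np.array([3.0, 4.0])

# Euclidean (L2) length: sqrt(3**2 + 4**2) = 5.0
length = np.linalg.norm(embedding)
normalized_embedding = embedding / length

# After normalization the vector has length 1 (up to floating-point error).
unit_length = np.linalg.norm(normalized_embedding)
print(normalized_embedding, unit_length)

# The other options measure something else entirely:
# sum -> 7.0 (total of the elements), len -> 2 (element count), max -> 4.0 (largest element)
print(sum(embedding), len(embedding), max(embedding))
```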
3. Fill in the blank
hard

Complete the code to generate embeddings for a list of texts.

embeddings = [model.[1](text) for text in texts]
A. encode
B. train
C. fit
D. predict
Common Mistakes
Using 'train' or 'fit' which are for model training, not embedding generation.
Using 'predict' which is for output predictions, not embeddings.
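The list-comprehension pattern can be sketched end to end with a stand-in model. `ToyEmbedder` and its three hand-picked features are assumptions made for illustration only; a real model would return learned vectors of much higher dimension.

```python
import numpy as np


class ToyEmbedder:
    """Hypothetical stand-in for an embedding model (simple text statistics)."""

    def encode(self, text):
        # Toy features: character count, vowel count, word count.
        lowered = text.lower()
        return np.array([
            float(len(text)),
            float(sum(ch in "aeiou" for ch in lowered)),
            float(len(text.split())),
        ])


model = ToyEmbedder()
texts = ["red shoes", "blue running shoes", "coffee mug"]

embeddings = [model.encode(text) for text in texts]  # one vector per text
matrix = np.stack(embeddings)                        # shape: (n_texts, dim)
print(matrix.shape)
```

Stacking the per-text vectors into one 2-D array is a common next step, since similarity and clustering routines usually expect a matrix with one row per item.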
4. Fill in the blank
hard

Fill both blanks to build a dictionary that maps each text to the length of its embedding, keeping only texts longer than 5 characters.

embedding_lengths = {text: len(model.[1](text)) for text in texts if len(text) [2] 5}
A. encode
B. >
C. <
D. predict
Common Mistakes
Using 'predict' instead of 'encode' for embeddings.
Using '<' instead of '>' which filters shorter texts.
5. Fill in the blank
hard

Fill all three blanks to compute cosine similarity between two embeddings.

cos_sim = np.dot([1], [2]) / (np.linalg.norm([3]) * np.linalg.norm([2]))
A. embedding1
B. embedding2
D. embedding3
Common Mistakes
Using a different vector like embedding3 which is undefined.
Reusing the same embedding in both norm calls, so the denominator no longer matches the two vectors in the dot product.
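The formula can be checked on vectors whose similarity is known in advance. The values below are arbitrary placeholders: two vectors pointing in the same direction and one orthogonal to the first. The last lines also show that, for vectors already scaled to unit length, the plain dot product gives the same number.

```python
import numpy as np


def cosine_similarity(a, b):
    # Dot product divided by the product of the two vector lengths.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


embedding1 = np.array([1.0, 2.0, 3.0])
embedding2 = np.array([2.0, 4.0, 6.0])   # same direction as embedding1
orthogonal = np.array([3.0, 0.0, -1.0])  # dot product with embedding1 is 0

same_direction = cosine_similarity(embedding1, embedding2)  # close to 1
right_angle = cosine_similarity(embedding1, orthogonal)     # exactly 0.0

# For unit-length vectors the denominator is 1, so the dot product alone suffices.
unit1 = embedding1 / np.linalg.norm(embedding1)
unit2 = embedding2 / np.linalg.norm(embedding2)
dot_of_units = np.dot(unit1, unit2)

print(same_direction, right_angle, dot_of_units)
```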

Practice

1. What is the main purpose of embedding generation in AI?
easy
A. To convert text or items into number vectors for easier comparison
B. To translate text from one language to another
C. To generate random numbers for encryption
D. To create images from text descriptions

Solution

  1. Step 1: Understand embedding generation

    Embedding generation transforms text or items into number vectors that computers can process.
  2. Step 2: Identify the main purpose

    This transformation helps in comparing meanings and finding similarities between data.
  3. Final Answer:

    To convert text or items into number vectors for easier comparison -> Option A
  4. Quick Check:

    Embedding = number vectors [OK]
Hint: Embeddings turn words into numbers for comparison [OK]
Common Mistakes:
  • Confusing embeddings with translation
  • Thinking embeddings generate images
  • Believing embeddings create random numbers
2. Which of the following is the correct way to represent an embedding vector in Python?
easy
A. embedding = {0.1, 0.5, 0.3, 0.9}
B. embedding = '0.1, 0.5, 0.3, 0.9'
C. embedding = [0.1, 0.5, 0.3, 0.9]
D. embedding = (0.1 0.5 0.3 0.9)

Solution

  1. Step 1: Identify valid Python data structures for vectors

    Embedding vectors are usually lists or arrays of numbers in Python.
  2. Step 2: Check each option

    Option C, embedding = [0.1, 0.5, 0.3, 0.9], is a list with comma-separated values, which is correct. Option B is a string, option A is a set (unordered, and it discards duplicate values), and option D is invalid syntax because the commas are missing.
  3. Final Answer:

    embedding = [0.1, 0.5, 0.3, 0.9] -> Option C
  4. Quick Check:

    Embedding vector = list of numbers [OK]
Hint: Embedding vectors are lists of numbers in Python [OK]
Common Mistakes:
  • Using strings instead of lists
  • Using sets which are unordered
  • Incorrect tuple syntax without commas
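The differences between these containers can be seen directly in a short sketch. The values are placeholders; the duplicate 0.5 is included on purpose to show what a set does to it.

```python
import numpy as np

values = [0.5, 0.1, 0.5, 0.9]

as_list = values             # keeps order and duplicates
as_set = set(values)         # drops the duplicate 0.5; order is not guaranteed
as_array = np.array(values)  # NumPy array, the usual form for vector math

# A string has to be parsed before it can be used as a vector.
as_string = "0.1, 0.5, 0.3, 0.9"
parsed = [float(part) for part in as_string.split(",")]

print(len(as_list), len(as_set), as_array.shape, parsed)
```

A set is a poor fit for an embedding because position matters: element i of one vector is compared with element i of another.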
3. Given the following code snippet, what will be the output?
import numpy as np
text_embedding = np.array([0.2, 0.4, 0.6])
query_embedding = np.array([0.1, 0.3, 0.5])
similarity = np.dot(text_embedding, query_embedding)
print(round(similarity, 2))
medium
A. 0.44
B. 0.28
C. 0.36
D. 0.52

Solution

  1. Step 1: Calculate the dot product of the two vectors

    Dot product = (0.2*0.1) + (0.4*0.3) + (0.6*0.5) = 0.02 + 0.12 + 0.30 = 0.44
  2. Step 2: Round the result to 2 decimal places

    Rounded value = 0.44
  3. Final Answer:

    0.44 -> Option A
  4. Quick Check:

    Dot product = 0.44 [OK]
Hint: Dot product sums element-wise products [OK]
Common Mistakes:
  • Multiplying vectors element-wise without summing
  • Rounding before summing
  • Confusing dot product with vector length
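The same calculation can be broken into its two steps, element-wise multiplication followed by a sum, to confirm that this is what `np.dot` computes for 1-D arrays. The variable names `products` and `manual` are introduced here for illustration.

```python
import numpy as np

text_embedding = np.array([0.2, 0.4, 0.6])
query_embedding = np.array([0.1, 0.3, 0.5])

# Step 1: multiply matching elements -> approximately [0.02, 0.12, 0.30]
products = text_embedding * query_embedding

# Step 2: add the products together
manual = products.sum()

# np.dot performs both steps in one call
similarity = np.dot(text_embedding, query_embedding)

print(round(float(manual), 2), round(float(similarity), 2))
```

Stopping after step 1 leaves a vector, not a similarity score, which is the first mistake listed above.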
4. The following code is intended to compute cosine similarity between two embeddings. Which statement about it is correct?
import numpy as np
def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

vec1 = np.array([1, 0, 0])
vec2 = np.array([0, 1, 0])
print(cosine_similarity(vec1, vec2))
medium
A. Division by zero error when vectors are zero
B. No error; code works correctly
C. Using lists instead of numpy arrays
D. Incorrect use of np.dot instead of np.cross

Solution

  1. Step 1: Analyze the cosine similarity function

    The function correctly computes dot product divided by product of norms.
  2. Step 2: Check the example vectors and output

    Vectors are numpy arrays and non-zero, so no division by zero occurs. The code runs correctly and prints 0.0.
  3. Final Answer:

    No error; code works correctly -> Option B
  4. Quick Check:

    Cosine similarity code = correct [OK]
Hint: Check for zero vectors to avoid division errors [OK]
Common Mistakes:
  • Confusing dot product with cross product
  • Forgetting to use numpy arrays
  • Not handling zero vectors, which make the denominator zero (NumPy then returns nan with a warning rather than a useful score)
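One way to guard against zero vectors is sketched below. The function name `safe_cosine_similarity`, the `eps` threshold, and the choice to return 0.0 for a zero vector are all assumptions made for this example; other conventions, such as raising an error, are equally defensible.

```python
import numpy as np


def safe_cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity that returns 0.0 when either vector has (near) zero length."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denominator = np.linalg.norm(a) * np.linalg.norm(b)
    if denominator < eps:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return float(np.dot(a, b) / denominator)


print(safe_cosine_similarity([1, 0, 0], [0, 1, 0]))  # orthogonal vectors
print(safe_cosine_similarity([0, 0, 0], [1, 2, 3]))  # zero vector handled explicitly
```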
5. You have a list of product descriptions and want to group similar products using embeddings. Which approach best helps you achieve this?
hard
A. Manually read and group descriptions without embeddings
B. Translate descriptions to another language before clustering
C. Use embeddings only for images, not text
D. Generate embeddings for each description, then use clustering on these vectors

Solution

  1. Step 1: Understand the goal of grouping similar products

    Grouping similar products means finding which descriptions are close in meaning.
  2. Step 2: Use embeddings and clustering

    Generating embeddings converts descriptions into vectors. Clustering groups vectors close in space, thus grouping similar products.
  3. Final Answer:

    Generate embeddings for each description, then use clustering on these vectors -> Option D
  4. Quick Check:

    Embedding + clustering = grouping similar items [OK]
Hint: Cluster embedding vectors to group similar items [OK]
Common Mistakes:
  • Thinking translation helps grouping
  • Assuming embeddings only work for images
  • Ignoring embeddings and grouping manually
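The embed-then-cluster idea can be sketched with a few lines of NumPy. Everything specific here is an assumption: the two-dimensional vectors are hand-written placeholders standing in for real model output, and plain k-means with the first k rows as starting centroids is just one simple clustering choice among many.

```python
import numpy as np


def kmeans(vectors, k, n_iters=10):
    """Plain k-means on row vectors; initial centroids are the first k rows."""
    centroids = vectors[:k].copy()
    for _ in range(n_iters):
        # Distance from every vector to every centroid: shape (n_vectors, k)
        distances = np.linalg.norm(
            vectors[:, None, :] - centroids[None, :, :], axis=2
        )
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of the vectors assigned to it.
        for j in range(k):
            members = vectors[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels


descriptions = ["red running shoes", "coffee mug", "blue trail sneakers", "ceramic tea cup"]

# Placeholder embeddings; in practice these would come from an embedding model.
embeddings = np.array([
    [0.9, 0.1],
    [0.1, 0.9],
    [0.8, 0.2],
    [0.2, 0.8],
])

labels = kmeans(embeddings, k=2)
for description, label in zip(descriptions, labels):
    print(label, description)
```

With these placeholder vectors the two footwear descriptions end up in one cluster and the two drinkware descriptions in the other.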