Prompt Engineering / GenAI · ~20 mins

Embedding dimensionality considerations in Prompt Engineering / GenAI - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Why choose higher embedding dimensions?

When creating embeddings for text data, why might you choose a higher dimensionality?

A. To capture more detailed features and subtle differences between items
B. To reduce training time and computational cost
C. Because higher dimensions always guarantee better accuracy regardless of data
D. To make the embeddings easier to visualize in 2D or 3D plots
💡 Hint

Think about what more dimensions allow the model to represent.

Metrics
intermediate
Effect of embedding size on cosine similarity

You have two sets of embeddings: one with 50 dimensions and one with 300 dimensions. Both sets are normalized. Which statement about cosine similarity between embeddings is true?

A. Higher-dimensional embeddings tend to have cosine similarities closer to zero due to sparsity
B. Cosine similarity values are directly comparable regardless of embedding size
C. Lower-dimensional embeddings always produce higher cosine similarity values
D. Cosine similarity cannot be computed for embeddings with different dimensions
💡 Hint

Consider how dimensionality affects vector distribution and sparsity.
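If you want to explore the hint empirically after answering, a small NumPy sketch (illustrative, with a hypothetical `mean_abs_cosine` helper) shows how the typical cosine similarity between random unit vectors changes with dimension:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_abs_cosine(dim, n_pairs=2000):
    """Mean |cosine similarity| between pairs of random unit vectors."""
    a = rng.standard_normal((n_pairs, dim))
    b = rng.standard_normal((n_pairs, dim))
    # Normalize each row to unit length, so the dot product is the cosine.
    a /= np.linalg.norm(a, axis=1, keepdims=True)
    b /= np.linalg.norm(b, axis=1, keepdims=True)
    return np.abs(np.sum(a * b, axis=1)).mean()

# Typical similarity magnitude shrinks as dimension grows
# (roughly like 1/sqrt(dim) for random directions).
print(mean_abs_cosine(50))
print(mean_abs_cosine(300))
```

Real embeddings are not random vectors, but the same concentration effect shifts their similarity distributions as dimensionality increases.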

Predict Output
advanced
Output shape of embedding layer

Consider the following PyTorch code snippet creating an embedding layer and passing input indices:

import torch
import torch.nn as nn

embedding = nn.Embedding(num_embeddings=1000, embedding_dim=128)
input_indices = torch.tensor([[1, 2, 3], [4, 5, 6]])
output = embedding(input_indices)
print(output.shape)
A. torch.Size([128, 3, 2])
B. torch.Size([3, 2, 128])
C. torch.Size([2, 128])
D. torch.Size([2, 3, 128])
💡 Hint

Think about how embedding layers map indices to vectors.
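To check your prediction afterwards without PyTorch installed, the lookup can be reproduced with plain NumPy fancy indexing; the zero-filled table below is a stand-in for the `nn.Embedding` weight matrix:

```python
import numpy as np

# Stand-in for nn.Embedding's weight matrix:
# num_embeddings rows, each an embedding_dim-sized vector.
num_embeddings, embedding_dim = 1000, 128
table = np.zeros((num_embeddings, embedding_dim))

input_indices = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)

# Each index is replaced by its row vector, so embedding_dim
# is appended as a new trailing axis.
output = table[input_indices]
print(output.shape)  # (2, 3, 128)
```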

Hyperparameter
advanced
Choosing embedding dimension for a small dataset

You have a small dataset with only 500 unique words. Which embedding dimension is most appropriate to avoid overfitting while maintaining useful representation?

A. 1024 dimensions to capture all nuances
B. 3000 dimensions to ensure maximum expressiveness
C. 50 dimensions to balance capacity and overfitting risk
D. 5 dimensions because the dataset is small
💡 Hint

Think about the trade-off between model complexity and data size.
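A quick sanity check on that trade-off: the embedding table contributes vocab_size × dim learnable parameters, so for the dimensions in the options above (illustrative arithmetic only):

```python
vocab_size = 500  # unique words in the dataset

# Parameter count of the embedding table alone for each candidate dimension.
for dim in (5, 50, 1024, 3000):
    params = vocab_size * dim
    print(f"dim={dim:>5}: {params:>9,} embedding parameters")
```

With only 500 words of evidence, the larger tables give each word far more free parameters than the data can constrain.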

🔧 Debug
expert
Why does increasing embedding dimension cause training to fail?

After increasing embedding dimension from 100 to 10000 in a neural network, training loss becomes NaN immediately. What is the most likely cause?

A. Higher-dimensional embeddings always cause NaN due to numerical overflow
B. The model runs out of memory, causing unstable gradients
C. The optimizer does not support large embedding sizes
D. Embedding dimension does not affect training stability
💡 Hint

Consider hardware limits and how large parameters affect training.
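A back-of-the-envelope estimate shows the scale involved. Assuming an illustrative 100k-token vocabulary and float32 weights (4 bytes per value; these numbers are assumptions, not from the question):

```python
vocab_size = 100_000  # assumed vocabulary size for illustration

for dim in (100, 10_000):
    # Weight memory for the embedding table alone, in float32.
    bytes_needed = vocab_size * dim * 4
    print(f"dim={dim:>6}: {bytes_needed / 1e9:.2f} GB")
```

The 100x jump in dimension means a 100x jump in table size, and Adam-style optimizers keep two additional moment tensors per parameter, roughly tripling that footprint, so memory pressure rather than the dimension itself is the usual culprit.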