
Semantic similarity with embeddings in NLP - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
What is the cosine similarity output between two embeddings?
Given two 3-dimensional embeddings:
embedding1 = [1, 0, 1]
embedding2 = [0, 1, 1]

Calculate the cosine similarity using the formula:
cosine_similarity = (A · B) / (||A|| * ||B||)
import numpy as np
embedding1 = np.array([1, 0, 1])
embedding2 = np.array([0, 1, 1])
cosine_similarity = np.dot(embedding1, embedding2) / (np.linalg.norm(embedding1) * np.linalg.norm(embedding2))
print(round(cosine_similarity, 3))
A. 0.5
B. 0.707
C. 0.0
D. 1.0
Hint
Recall that cosine similarity measures the angle between two vectors: the dot product divided by the product of their magnitudes.
Model Choice
intermediate
Which embedding model is best for capturing sentence-level semantic similarity?
You want to compare the meaning of full sentences, not just words. Which model is most suitable?
A. Word2Vec trained on individual words
B. One-hot encoding of words
C. Sentence-BERT (SBERT) embeddings
D. GloVe embeddings for words
Hint
Consider models designed to produce embeddings for entire sentences.
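To make the word-level versus sentence-level distinction concrete, here is a minimal sketch, assuming a toy vocabulary of made-up 3-dimensional word vectors. The `toy_vectors` dictionary and the `average_embedding` helper are hypothetical and not part of any library. It shows that averaging word vectors discards word order, so two sentences with different meanings can end up with the same embedding.

```python
import numpy as np

# Hypothetical toy word vectors, invented purely for illustration.
toy_vectors = {
    "dog": np.array([1.0, 0.0, 0.0]),
    "bites": np.array([0.0, 1.0, 0.0]),
    "man": np.array([0.0, 0.0, 1.0]),
}

def average_embedding(sentence):
    """Average the word vectors of a whitespace-tokenized sentence."""
    words = sentence.lower().split()
    return np.mean([toy_vectors[w] for w in words], axis=0)

a = average_embedding("dog bites man")
b = average_embedding("man bites dog")

# Averaging ignores word order, so both sentences map to the same vector.
same = bool(np.allclose(a, b))
print(same)
```

Both sentences contain the same words, so their averaged vectors coincide even though the meanings differ.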
Hyperparameter
advanced
Which hyperparameter affects the quality of semantic similarity in embedding training?
When training a Word2Vec model, which hyperparameter most directly influences the semantic quality of embeddings?
A. Number of training epochs
B. Window size (context window)
C. Learning rate decay schedule
D. Batch size
Hint
Think about how much context the model sees around each word.
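As a rough illustration of what "context around each word" means, the sketch below generates (center, context) training pairs for a symmetric window. The `context_pairs` function is a hypothetical helper written for this example. It deliberately omits details that real Word2Vec implementations add, such as subsampling and negative sampling.

```python
def context_pairs(tokens, window):
    """Return (center, context) pairs for a symmetric context window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                 # leftmost index in the window
        hi = min(len(tokens), i + window + 1)   # one past the rightmost index
        for j in range(lo, hi):
            if j != i:                          # skip the center word itself
                pairs.append((center, tokens[j]))
    return pairs

tokens = "the cat sat on the mat".split()
small = context_pairs(tokens, window=1)
large = context_pairs(tokens, window=3)
print(len(small), len(large))
```

With a window of 1, "cat" is paired only with its immediate neighbours. With a window of 3 it is also paired with more distant words such as "on", so the model sees broader context for each word.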
Metrics
advanced
Which metric is best to evaluate semantic similarity between embeddings?
You have two sentence embeddings and want to measure how similar their meanings are. Which metric is most appropriate?
A. Euclidean distance
B. Jaccard similarity
C. Mean squared error
D. Cosine similarity
Hint
Consider a metric that measures angle between vectors regardless of length.
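To see what "regardless of length" means in practice, the following sketch compares a vector with a scaled copy of itself. The vectors are arbitrary values chosen for illustration, and `cosine_similarity` here is a small local helper, not a library function.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v = np.array([1.0, 2.0, 3.0])
scaled = 10.0 * v  # same direction, ten times the length

cos = cosine_similarity(v, scaled)
dist = float(np.linalg.norm(v - scaled))  # Euclidean distance
print(round(cos, 6), round(dist, 3))
```

The direction of the two vectors is identical, so the angle-based score stays at its maximum, while the Euclidean distance grows with the scaling factor.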
Debug
expert
Why does this semantic similarity code produce a runtime error?
Code snippet:
import numpy as np
embedding1 = [0.1, 0.3, 0.5]
embedding2 = [0.2, 0.4]
cos_sim = np.dot(embedding1, embedding2) / (np.linalg.norm(embedding1) * np.linalg.norm(embedding2))
print(cos_sim)

What causes the error?
A. Vectors have different lengths causing dot product error
B. np.linalg.norm cannot compute norm of lists
C. Division by zero due to zero norm
D. np.dot requires integer inputs
Hint
Check if both vectors have the same number of elements.
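One defensive pattern, sketched here under the assumption that both inputs are 1-D vectors, is to validate shapes before calling `np.dot`. The `safe_cosine_similarity` name is hypothetical. The zero-norm check is an extra guard that the snippet above does not need, but it is a common companion check.

```python
import numpy as np

def safe_cosine_similarity(a, b):
    """Cosine similarity for 1-D vectors with explicit input checks."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        # Fail early with a readable message instead of a NumPy shape error.
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)

try:
    safe_cosine_similarity([0.1, 0.3, 0.5], [0.2, 0.4])
    message = None
except ValueError as exc:
    message = str(exc)
print(message)
```

Checking shapes up front turns a low-level NumPy error into a message that names both vector sizes.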