Challenge - 5 Problems
LLM Scaling Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
🧠 Conceptual
intermediate
Understanding the relationship between model size and performance
Which statement best describes the general trend observed in LLM scaling laws regarding model size and performance?
💡 Hint
Think about how small increases in size can lead to significant improvements, but not in a simple linear way.
✗ Incorrect
LLM scaling laws show that performance improves with model size according to a power law: each increase in parameter count yields meaningful but sub-linear gains, so the relationship is neither linear nor logarithmic.
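The power-law relationship can be sketched numerically. This is a minimal illustration with made-up constants (a, b, c are not fitted to any real model):

```python
# Illustrative power-law scaling: loss = a * N**(-b) + c.
# The constants below are assumptions for demonstration only.
def scaling_loss(n_params, a=10.0, b=1 / 3, c=0.1):
    """Predicted loss for a model with n_params parameters."""
    return a * n_params ** (-b) + c

# Doubling parameters improves loss, but by far less than 2x:
for n in (1e6, 2e6, 4e6):
    print(f"N={n:.0e}  loss={scaling_loss(n):.4f}")
```

Each doubling shrinks only the power-law term, and only by a factor of 2^(-b), which is why the gains are significant but not linear.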
❓ Metrics
intermediate
Evaluating loss behavior with increased compute
According to LLM scaling laws, how does the training loss typically change as the amount of compute used for training increases?
💡 Hint
Consider how more compute allows better fitting but with diminishing returns.
✗ Incorrect
Training loss decreases as a power law in compute: each additional order of magnitude of compute yields a consistent but shrinking improvement, i.e. diminishing returns without a hard plateau.
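A quick sketch of diminishing returns in compute, using illustrative constants (k and alpha are assumptions, not fitted to real training runs):

```python
# Illustrative power law in compute: L(C) = k * C**(-alpha).
# k and alpha below are made-up values for demonstration.
k, alpha = 100.0, 0.05
prev = None
for compute in (1e18, 1e19, 1e20):
    loss = k * compute ** (-alpha)
    drop = "" if prev is None else f"  (improvement: {prev - loss:.3f})"
    print(f"C={compute:.0e}  loss={loss:.3f}{drop}")
    prev = loss
```

Each 10x in compute still lowers the loss, but the absolute improvement per 10x keeps shrinking.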
❓ Model Choice
advanced
Choosing model size for fixed compute budget
Given a fixed compute budget, which strategy aligns best with LLM scaling laws to minimize training loss?
💡 Hint
Think about how compute is split between model size and training duration.
✗ Incorrect
LLM scaling laws suggest that for a fixed compute budget, loss is minimized by balancing model size against the number of training tokens, rather than maximizing either one alone.
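One way to sketch this balance, assuming the common approximation C ≈ 6·N·D (FLOPs ≈ 6 × parameters × tokens) and a Chinchilla-style heuristic of roughly 20 tokens per parameter. Both are assumptions here, not exact laws:

```python
import math

def compute_optimal_split(c_flops, tokens_per_param=20, flops_per_token=6):
    """Split a fixed compute budget between parameters N and tokens D.

    Assumes C = flops_per_token * N * D and D = tokens_per_param * N,
    so N = sqrt(C / (flops_per_token * tokens_per_param)).
    """
    n = math.sqrt(c_flops / (flops_per_token * tokens_per_param))
    d = tokens_per_param * n
    return n, d

n, d = compute_optimal_split(1e21)
print(f"params ≈ {n:.2e}, tokens ≈ {d:.2e}")
```

For a 1e21-FLOP budget this heuristic lands on roughly a few billion parameters trained on tens of billions of tokens, rather than the largest model the budget could hold.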
🔧 Debug
advanced
Identifying incorrect interpretation of scaling laws
Which of the following interpretations of LLM scaling laws is incorrect?
💡 Hint
Consider if the relationship between parameters and loss is linear or not.
✗ Incorrect
Doubling parameters does not halve loss; improvements follow a power-law with diminishing returns, so this interpretation is incorrect.
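A one-line numeric check makes this concrete. With an illustrative power-law exponent b = 1/3 (an assumption for demonstration):

```python
# With loss ∝ N**(-b), doubling N multiplies the power-law term
# by 2**(-b). For b = 1/3 that is about 0.794: a ~21% reduction
# in the term, nowhere near halving the loss.
b = 1 / 3
ratio = 2 ** (-b)  # loss-term ratio after doubling N
print(f"{ratio:.3f}")
```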
❓ Predict Output
expert
Predicting training loss from scaling law formula
Given the scaling law formula for training loss:
loss = a * N^(-b) + c, where N is the number of parameters, a = 10, b = 1/3, and c = 0.1. What is the training loss when N = 1000000?
a = 10
b = 1/3
c = 0.1
N = 1000000
loss = a * (N)**(-b) + c
print(round(loss, 4))
💡 Hint
Calculate N to the power of -b first, then multiply by a and add c.
✗ Incorrect
1000000 = 10^6, so (10^6)^(-1/3) = 10^(-6/3) = 10^(-2) = 0.01 exactly. Then 10 * 0.01 = 0.1, and adding c = 0.1 gives loss = 0.2. Python's round(0.2, 4) returns 0.2, so the script prints 0.2.