Recall & Review
beginner
What is data parallelism in machine learning?
Data parallelism means splitting the data into smaller parts and processing each part on a different machine or processor at the same time. Each machine holds a full copy of the model.
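As a rough illustration, the sketch below simulates data parallelism in plain Python. It assumes a toy one-parameter model `y = w * x`, two simulated "devices", and hypothetical helper names (`gradient`, `split`); a real system would run the per-shard work on separate hardware.

```python
# Toy data parallelism: one model (y = w * x), data split across simulated devices.

def gradient(w, shard):
    """Mean gradient of the squared error 0.5 * (w*x - y)**2 over one shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def split(data, num_devices):
    """Split data into num_devices equal, contiguous shards."""
    size = len(data) // num_devices
    return [data[i * size:(i + 1) * size] for i in range(num_devices)]

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # samples of y = 2x
shards = split(data, num_devices=2)

w = 0.0  # every device starts from the same copy of the model
for _ in range(50):
    # Each device computes a gradient on its own shard (in parallel in practice).
    local_grads = [gradient(w, shard) for shard in shards]
    # Average the gradients so every copy of the model applies the same update.
    avg_grad = sum(local_grads) / len(local_grads)
    w -= 0.1 * avg_grad

print(round(w, 3))  # close to 2.0, the true slope
```

Because the shards here are the same size, the averaged shard gradients equal the gradient over the full data set, so the result matches single-device training.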
beginner
What is model parallelism in machine learning?
Model parallelism means splitting the model itself into parts and running each part on a different machine or processor. The same input data flows through all of the parts, with intermediate results passed between devices.
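A minimal sketch of the idea, assuming a toy two-layer model and a hypothetical `Device` class that stands in for real hardware: each simulated device holds only its own layer, and the intermediate result is handed from one to the other.

```python
# Toy model parallelism: a two-layer model whose layers live on different devices.

class Device:
    """Hypothetical stand-in for an accelerator that holds one part of the model."""

    def __init__(self, name, weight, bias):
        self.name = name
        self.weight = weight
        self.bias = bias

    def forward(self, x):
        # Only this device's own parameters are used here.
        return self.weight * x + self.bias

device_0 = Device("device_0", weight=2.0, bias=1.0)   # holds layer 1
device_1 = Device("device_1", weight=-1.0, bias=3.0)  # holds layer 2

def model_parallel_forward(x):
    hidden = device_0.forward(x)     # computed on device 0
    # In a real system, `hidden` would be sent over the interconnect here.
    return device_1.forward(hidden)  # computed on device 1

outputs = [model_parallel_forward(x) for x in [0.0, 1.0, 2.0]]
print(outputs)  # [2.0, 0.0, -2.0]
```

Neither device ever stores the full model; the cost is the transfer of `hidden` between them on every input.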
beginner
Which parallelism method copies the entire model on each device?
Data parallelism copies the entire model on each device and splits the data among them.
intermediate
When is model parallelism preferred over data parallelism?
Model parallelism is preferred when the model is too big to fit into the memory of a single device.
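A back-of-the-envelope check can make this concrete. The sketch below assumes 4 bytes per parameter (32-bit floats) and counts parameter memory only; gradients, optimizer state, and activations are ignored, so real requirements are higher. The function names and the example sizes are hypothetical.

```python
import math

def parameter_memory_gb(num_params, bytes_per_param=4):
    """Memory needed for the parameters alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

def min_devices_for_model(num_params, device_memory_gb, bytes_per_param=4):
    """Smallest number of devices whose combined memory can hold the parameters."""
    needed = parameter_memory_gb(num_params, bytes_per_param)
    return math.ceil(needed / device_memory_gb)

# Hypothetical example: a 10-billion-parameter model on 16 GB devices.
needed_gb = parameter_memory_gb(10_000_000_000)
devices = min_devices_for_model(10_000_000_000, device_memory_gb=16)

print(needed_gb)  # 40.0
print(devices)    # 3
```

When the result is greater than 1, a full copy of the model cannot sit on one device, so data parallelism alone is not an option and the model has to be split.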
intermediate
What is a key challenge of data parallelism?
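A key challenge is synchronizing model updates across devices: after each device processes its own part of the data, the updates (typically gradients) must be combined so that every copy of the model stays identical.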
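The sketch below illustrates why this matters, under simplifying assumptions: each replica is a single number, the per-device gradients are made-up values, and `all_reduce_mean` is a hypothetical helper that imitates an all-reduce by handing the mean to every device.

```python
def all_reduce_mean(values):
    """Return the mean of the per-device values, one copy for each device."""
    mean = sum(values) / len(values)
    return [mean] * len(values)

replicas = [1.0, 1.0, 1.0]      # three identical copies of a one-parameter model
local_grads = [0.2, -0.1, 0.5]  # made-up gradients from three different data parts
learning_rate = 0.5

# Without synchronization, each copy applies its own gradient and the copies drift apart.
unsynced = [w - learning_rate * g for w, g in zip(replicas, local_grads)]

# With synchronization, every copy applies the same averaged gradient.
synced_grads = all_reduce_mean(local_grads)
synced = [w - learning_rate * g for w, g in zip(replicas, synced_grads)]

print(len(set(unsynced)))  # 3 distinct values: the replicas have diverged
print(len(set(synced)))    # 1 distinct value: the replicas stay identical
```

The exchange of gradients happens on every training step, which is where the communication cost of data parallelism comes from.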
In data parallelism, what is split across devices?
Data parallelism splits the data across devices while each device has a full copy of the model.
Which parallelism is best when the model is too large for one device?
Model parallelism splits the model across devices, useful when the model is too big for one device.
What must happen after each device processes its data in data parallelism?
Model updates from each device must be synchronized to keep the model consistent.
In model parallelism, what is shared across devices?
In model parallelism, the model is split across devices while the data is shared: the same input passes through every part of the model.
Which parallelism method can cause communication overhead due to model synchronization?
Data parallelism requires synchronization of model updates, which can cause communication overhead.
Explain the difference between data parallelism and model parallelism in simple terms.
Think about what is divided and what is copied in each method.
Describe a scenario where model parallelism is necessary and why data parallelism would not work well.
Consider device memory limits and model size.