Recall & Review
beginner
What is the main limitation of MapReduce that Spark addresses?
MapReduce writes intermediate results to disk between steps, which slows down multi-step processing. Spark can keep intermediate data in memory, which often makes it much faster.
beginner
How does Spark improve speed compared to MapReduce?
Spark uses in-memory computing: it keeps intermediate data in RAM where possible instead of writing it to disk after every step.
intermediate
What kind of tasks is Spark better suited for compared to MapReduce?
Spark is better for iterative tasks like machine learning and interactive data analysis because it can reuse data in memory.
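The reuse pattern described above can be sketched with a toy model. This is plain Python, not Spark or Hadoop code: `FakeDisk`, `mean_step`, `iterate_rereading`, and `iterate_cached` are hypothetical names invented for illustration, and the read counter only stands in for the cost of going back to storage.

```python
# Toy model only: plain Python, not Spark or Hadoop APIs.
class FakeDisk:
    """Hypothetical stand-in for slow storage; counts how often it is read."""

    def __init__(self, data):
        self._data = list(data)
        self.reads = 0

    def read(self):
        self.reads += 1
        return list(self._data)


def mean_step(points, guess):
    """One refinement step: move the guess halfway toward the mean."""
    mean = sum(points) / len(points)
    return guess + 0.5 * (mean - guess)


def iterate_rereading(disk, iterations):
    """MapReduce-like pattern: each iteration is a new job that re-reads its input."""
    guess = 0.0
    for _ in range(iterations):
        guess = mean_step(disk.read(), guess)
    return guess


def iterate_cached(disk, iterations):
    """Spark-like pattern: read once, keep the data in memory, reuse it."""
    points = disk.read()  # analogous to caching a dataset in memory
    guess = 0.0
    for _ in range(iterations):
        guess = mean_step(points, guess)
    return guess


disk_a = FakeDisk([2.0, 4.0, 6.0])
disk_b = FakeDisk([2.0, 4.0, 6.0])
result_a = iterate_rereading(disk_a, 10)
result_b = iterate_cached(disk_b, 10)
print(disk_a.reads, disk_b.reads)  # storage reads for each pattern
```

Both functions compute the same result; the only difference is how many times the input is fetched from storage, which is the cost that grows with the number of iterations.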
beginner
Why is Spark considered more user-friendly than MapReduce?
Spark provides high-level APIs in several languages, including Python, Java, Scala, and R, while MapReduce jobs are typically written as verbose Java code.
intermediate
What is a key architectural difference between Spark and MapReduce?
MapReduce follows a strict two-stage process (map and reduce) with disk writes in between, while Spark uses a directed acyclic graph (DAG) to optimize execution and reduce disk I/O.
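The idea of planning work before running it can be sketched with a small lazy pipeline. This is a toy in plain Python, not Spark's scheduler or API: `LazyPipeline` and its methods are hypothetical, and the plan here is a linear chain, which is only the simplest case of a DAG. It shows steps being recorded first and then executed in a single pass, without materializing an intermediate result after each step.

```python
# Toy model only: plain Python, not Spark's actual scheduler or API.
class LazyPipeline:
    """Hypothetical mini-pipeline that records steps and runs them on demand."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)

    def map(self, fn):
        # Record the step; nothing is computed yet.
        return LazyPipeline(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        # Record the step; nothing is computed yet.
        return LazyPipeline(self._source, self._steps + (("filter", pred),))

    def plan(self):
        """Return the recorded step names, i.e. a linear execution plan."""
        return [kind for kind, _ in self._steps]

    def collect(self):
        """Action: run all recorded steps in one pass over the source."""
        out = []
        for item in self._source:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                out.append(item)
        return out


pipeline = (
    LazyPipeline(range(10))
    .map(lambda x: x * x)
    .filter(lambda x: x % 2 == 0)
    .map(lambda x: x + 1)
)
print(pipeline.plan())     # ['map', 'filter', 'map']
print(pipeline.collect())  # [1, 5, 17, 37, 65]
```

A MapReduce-style equivalent would run each step as its own job and store the full intermediate result before starting the next one; the sketch avoids that by passing each item through all recorded steps at once.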
What does Spark use to speed up data processing compared to MapReduce?
Spark speeds up processing by keeping data in memory, avoiding slow disk writes after each step.
Which programming languages does Spark support with easy APIs?
Spark offers APIs in Python, Java, and Scala, making it accessible to many developers.
Why is Spark better for machine learning tasks than MapReduce?
Machine learning often requires repeating calculations; Spark keeps data in memory to speed this up.
What is a major drawback of MapReduce compared to Spark?
MapReduce writes data to disk after each map and reduce step, which slows down processing.
What execution model does Spark use to optimize tasks?
Spark uses a DAG to plan and optimize task execution, reducing unnecessary disk I/O.
Explain why Spark has largely replaced MapReduce for big data processing.
Think about speed, usability, and how data is handled during processing.
Describe the architectural differences between Spark and MapReduce.
Focus on how tasks are planned and how data moves during processing.