
Why Spark replaced MapReduce for big data - Quick Recap

Recall & Review
beginner
What is the main limitation of MapReduce that Spark addresses?
MapReduce writes data to disk after each step, which slows down processing. Spark keeps data in memory, making it much faster.
beginner
How does Spark improve speed compared to MapReduce?
Spark uses in-memory computing, which means it stores data in RAM during processing instead of writing to disk after every step.
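The cards above contrast MapReduce's disk-per-stage model with Spark's in-memory model. As a toy illustration in plain Python (these functions are invented for the sketch; they are not the real Spark or MapReduce APIs):

```python
import json
import os
import tempfile

STAGES = (lambda r: [x * 2 for x in r],   # stage 1: double each value
          lambda r: [x + 1 for x in r])   # stage 2: add one to each value

def disk_pipeline(records):
    """MapReduce-style: each stage writes its output to disk,
    and the next stage reads it back before doing any work."""
    result = records
    for stage in STAGES:
        path = os.path.join(tempfile.mkdtemp(), "intermediate.json")
        with open(path, "w") as f:
            json.dump(stage(result), f)   # disk write after the stage
        with open(path) as f:
            result = json.load(f)         # disk read by the next stage
    return result

def memory_pipeline(records):
    """Spark-style: intermediate results stay in RAM and are
    handed directly to the next stage."""
    result = records
    for stage in STAGES:
        result = stage(result)            # no disk I/O between stages
    return result

print(disk_pipeline([1, 2, 3]))    # [3, 5, 7]
print(memory_pipeline([1, 2, 3]))  # [3, 5, 7]
```

Both pipelines compute the same answer; the only difference is the disk round trip between stages, which is exactly the overhead the flashcards describe.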
intermediate
What kind of tasks is Spark better suited for compared to MapReduce?
Spark is better for iterative tasks like machine learning and interactive data analysis because it can reuse data in memory.
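Why in-memory reuse matters for iterative work can be shown with a small counting sketch. This is plain Python, not Spark; `load_from_disk` and the loop are invented for illustration (the cached version is roughly analogous to calling `cache()` on a Spark RDD or DataFrame):

```python
load_count = 0

def load_from_disk():
    """Stand-in for reading the input dataset from storage."""
    global load_count
    load_count += 1
    return [1.0, 2.0, 3.0, 4.0]

# MapReduce-style: every iteration re-reads the input.
load_count = 0
w = 0.0
for _ in range(10):
    data = load_from_disk()
    w += sum(data) / len(data) * 0.01
mapreduce_loads = load_count
print(mapreduce_loads)  # 10 loads for 10 iterations

# Spark-style: load once, keep it in memory, iterate over the cached copy.
load_count = 0
cached = load_from_disk()
w = 0.0
for _ in range(10):
    w += sum(cached) / len(cached) * 0.01
spark_loads = load_count
print(spark_loads)      # 1 load for 10 iterations
```

An algorithm that iterates 100 times pays the load cost 100 times in the first style and once in the second, which is why machine learning workloads benefit so much from in-memory reuse.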
beginner
Why is Spark considered more user-friendly than MapReduce?
Spark provides easy-to-use APIs in multiple languages like Python, Java, and Scala, while MapReduce requires writing complex Java code.
intermediate
What is a key architectural difference between Spark and MapReduce?
MapReduce follows a strict two-stage process (map and reduce) with disk writes in between, while Spark uses a directed acyclic graph (DAG) to optimize execution and reduce disk I/O.
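The DAG idea can be sketched with a minimal lazy-evaluation class: transformations only record a plan, and nothing executes until an action is called. The class and method names are invented for this sketch (a linear plan standing in for Spark's full DAG), not Spark's real API:

```python
class LazyDataset:
    """Toy lazy dataset: transformations build a plan, actions run it."""

    def __init__(self, data, plan=None):
        self.data = data
        self.plan = plan or []          # the recorded plan of steps

    def map(self, fn):                  # transformation: just extends the plan
        return LazyDataset(self.data, self.plan + [("map", fn)])

    def filter(self, pred):             # transformation: also lazy
        return LazyDataset(self.data, self.plan + [("filter", pred)])

    def collect(self):                  # action: the whole plan runs now
        out = self.data
        for kind, fn in self.plan:
            if kind == "map":
                out = [fn(x) for x in out]
            else:
                out = [x for x in out if fn(x)]
        return out

ds = LazyDataset([1, 2, 3, 4]).map(lambda x: x * 10).filter(lambda x: x > 15)
# No computation has happened yet; only the two-step plan exists.
print(ds.collect())  # [20, 30, 40]
```

Because Spark sees the whole plan before running anything, it can fuse steps and schedule them without forcing a disk write between each map and reduce, unlike MapReduce's fixed two-stage model.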
What does Spark use to speed up data processing compared to MapReduce?
A. In-memory computing
B. More disk writes
C. Slower network
D. Single-threaded processing
Answer: A
Which programming languages does Spark support with easy APIs?
A. HTML and CSS
B. Only Java
C. C++ and Ruby
D. Python, Java, Scala
Answer: D
Why is Spark better for machine learning tasks than MapReduce?
A. It writes data to disk more often
B. It can reuse data in memory for multiple iterations
C. It uses only one CPU core
D. It does not support iterative tasks
Answer: B
What is a major drawback of MapReduce compared to Spark?
A. It uses a DAG for execution
B. It processes data only in memory
C. It writes intermediate data to disk after each step
D. It supports too many programming languages
Answer: C
What execution model does Spark use to optimize tasks?
A. Directed Acyclic Graph (DAG)
B. Linear pipeline
C. Single map-reduce step
D. Random execution
Answer: A
Explain why Spark replaced MapReduce for big data processing.
Think about speed, usability, and how data is handled during processing.
Describe the architectural differences between Spark and MapReduce.
Focus on how tasks are planned and how data moves during processing.