What is the main responsibility of the Driver in a Spark application?
Think about which component controls the flow and task scheduling in Spark.
The Driver is the central coordinator that creates the SparkContext, converts user code into tasks, and schedules them on executors.
Which of the following best describes the role of Executors in Spark?
Think of executors as the workers that perform the actual data processing.
Executors run the tasks assigned by the driver and hold data in memory or on disk during job execution.
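The driver/executor split above can be sketched with a toy simulation in plain Python (no real Spark API here; the `Driver` class and `run_task` function are illustrative stand-ins, with each executor modeled as a thread-pool worker):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for user code: square each element of a partition.
def run_task(partition):
    return [x * x for x in partition]

class Driver:
    """Toy driver: splits a dataset into tasks and schedules them on executors."""
    def __init__(self, num_executors):
        # Each "executor" is modeled as one worker slot in a thread pool.
        self.pool = ThreadPoolExecutor(max_workers=num_executors)

    def run_job(self, data, num_partitions):
        # Convert the job into tasks (one per partition)...
        partitions = [data[i::num_partitions] for i in range(num_partitions)]
        # ...schedule them on the executors...
        futures = [self.pool.submit(run_task, p) for p in partitions]
        # ...and collect the results back at the driver.
        return [item for f in futures for item in f.result()]

driver = Driver(num_executors=3)
result = driver.run_job(list(range(6)), num_partitions=3)
print(sorted(result))  # [0, 1, 4, 9, 16, 25]
```

As in real Spark, the driver only plans and coordinates; the actual computation (`run_task`) happens on the workers, and results flow back to the driver.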
What is the primary role of the Cluster Manager in Spark?
Think about who controls resource distribution for all applications running on the cluster.
The Cluster Manager allocates resources like CPU and memory to Spark applications and manages multiple applications running on the cluster.
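A minimal sketch of that resource-allocation role, as a toy bookkeeping class (all names here are illustrative, not a real Spark or YARN API):

```python
# Toy cluster manager: grants CPU and memory to applications from a fixed pool,
# refusing requests that would exceed what is still free.
class ClusterManager:
    def __init__(self, total_cores, total_memory_gb):
        self.free_cores = total_cores
        self.free_memory_gb = total_memory_gb
        self.allocations = {}

    def request_resources(self, app_id, cores, memory_gb):
        """Grant the request if the pool can satisfy it, else refuse."""
        if cores <= self.free_cores and memory_gb <= self.free_memory_gb:
            self.free_cores -= cores
            self.free_memory_gb -= memory_gb
            self.allocations[app_id] = (cores, memory_gb)
            return True
        return False

cm = ClusterManager(total_cores=16, total_memory_gb=64)
print(cm.request_resources("app-1", cores=12, memory_gb=48))  # True
print(cm.request_resources("app-2", cores=8, memory_gb=16))   # False: only 4 cores left
print(cm.free_cores)  # 4
```

This mirrors why the cluster manager sits above individual applications: it sees the whole pool, so it can arbitrate between multiple Spark applications competing for the same cores and memory.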
Consider a Spark application with 3 executors, each running 4 cores. If the application runs a job with 12 tasks, how many tasks will each executor run assuming tasks are evenly distributed?
executors = 3
cores_per_executor = 4
tasks = 12

# Calculate tasks per executor assuming even distribution
tasks_per_executor = tasks // executors
print(tasks_per_executor)
Divide total tasks by number of executors for even distribution.
With 12 tasks and 3 executors, each executor runs 12 / 3 = 4 tasks; since each executor has 4 cores, all 4 of its tasks can run in parallel.
What happens if the Driver fails during Spark job execution in cluster mode?
Consider the driver's role in task scheduling and coordination.
The driver coordinates job execution and tracks task state. If it fails, the SparkContext is lost, the executors have no source of tasks, and the application fails (unless the cluster manager is configured to restart the driver).