Recall & Review
beginner
What is a transformation in Apache Spark?
A transformation is an operation on an RDD or DataFrame that returns a new RDD or DataFrame. It is lazy, meaning it does not execute immediately but builds a plan for computation.
beginner
What is an action in Apache Spark?
An action triggers the execution of the transformations and returns a result to the driver or writes data to storage. Actions cause Spark to compute the data.
intermediate
Why are transformations called lazy in Spark?
Because Spark does not run transformations immediately. It waits until an action is called to optimize and execute the whole set of transformations together.
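The idea of recording steps and running them only when an action is called can be sketched in plain Python. This is a toy model, not Spark's API: `LazyDataset`, `double`, and `calls` are hypothetical names made up for illustration, and real Spark also optimizes and distributes the plan, which this sketch does not attempt.

```python
from typing import Any, Callable, Iterable, List, Tuple


class LazyDataset:
    """Toy stand-in for an RDD (hypothetical): records steps, runs them only on an action."""

    def __init__(self, source: Iterable[Any],
                 plan: Tuple[Tuple[str, Callable], ...] = ()):
        self._source = list(source)
        self._plan = plan

    # "Transformations": return a new dataset with one more recorded step.
    def map(self, fn: Callable) -> "LazyDataset":
        return LazyDataset(self._source, self._plan + (("map", fn),))

    def filter(self, pred: Callable) -> "LazyDataset":
        return LazyDataset(self._source, self._plan + (("filter", pred),))

    # "Actions": run the whole recorded plan and return a concrete result.
    def collect(self) -> List[Any]:
        rows = iter(self._source)
        for kind, fn in self._plan:
            rows = map(fn, rows) if kind == "map" else filter(fn, rows)
        return list(rows)

    def count(self) -> int:
        return len(self.collect())


calls = []  # records every element that double() actually processes


def double(x):
    calls.append(x)
    return x * 2


pipeline = LazyDataset(range(5)).map(double).filter(lambda x: x > 4)
calls_before_action = len(calls)  # still 0: only a plan has been built
result = pipeline.collect()       # the action executes the recorded steps
calls_after_action = len(calls)
```

Building `pipeline` does no work on the data; `double` runs only once `collect()` is called, which mirrors how a Spark action triggers the pending transformations.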
beginner
Give an example of a transformation and an action in Spark.
Transformation examples: map(), filter(). Action examples: collect(), count().
intermediate
How do actions affect performance in Spark?
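Actions trigger computation, so they can be expensive: each action launches a job that re-runs the transformations it depends on unless an intermediate result has been cached or persisted. Avoiding redundant actions and chaining transformations before a single action lets Spark optimize the plan as a whole.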
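To see why the number of actions matters, here is a small plain-Python sketch of the cost model. It is an illustration under simplifying assumptions, not Spark code: `CountingPipeline`, `total()`, and the `evaluations` counter are hypothetical, and the `cache()` method only imitates the idea behind Spark's caching.

```python
class CountingPipeline:
    """Toy model (hypothetical): each action re-runs the work unless it is cached."""

    def __init__(self, data, fn):
        self._data = list(data)
        self._fn = fn
        self._use_cache = False
        self._cached = None
        self.evaluations = 0  # how many times fn has been applied

    def cache(self):
        # Only marks the result as cacheable; it is stored on the first action.
        self._use_cache = True
        return self

    def _compute(self):
        if self._use_cache and self._cached is not None:
            return self._cached
        out = []
        for x in self._data:
            self.evaluations += 1
            out.append(self._fn(x))
        if self._use_cache:
            self._cached = out
        return out

    # Two "actions" that both need the computed values.
    def count(self):
        return len(self._compute())

    def total(self):
        return sum(self._compute())


uncached = CountingPipeline(range(4), lambda x: x * x)
uncached_count = uncached.count()   # first action: 4 evaluations
uncached_total = uncached.total()   # second action: 4 more evaluations

cached = CountingPipeline(range(4), lambda x: x * x).cache()
cached_count = cached.count()       # first action computes and stores
cached_total = cached.total()       # second action reuses the stored values
```

In this sketch the uncached pipeline applies the function eight times across two actions, while the cached one applies it four times, which is the trade-off to weigh when a dataset feeds several actions.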
Which of the following is a transformation in Spark?
filter() is a transformation that returns a new RDD or DataFrame without triggering computation.
What does an action in Spark do?
Actions trigger Spark to execute all pending transformations and return results or write data.
Why are transformations in Spark considered lazy?
Transformations only build a computation plan and wait for an action to trigger execution.
Which of these is an action in Spark?
reduce() is an action that aggregates data and triggers computation.
What happens if you call multiple transformations without an action?
Transformations are lazy and do not execute until an action triggers computation.
Explain the difference between transformations and actions in Apache Spark.
Think about when Spark actually runs the code.
Why is lazy evaluation important in Spark's transformations?
Consider how Spark saves time and resources.