
Transformations vs actions in Apache Spark - Practice Questions

Predict Output (intermediate)
Understanding lazy evaluation in Spark
What will be the output of this Spark code snippet?
Apache Spark
val rdd = sc.parallelize(Seq(1, 2, 3, 4))
val mapped = rdd.map(x => x * 2)
// No action called yet
mapped.count()
A. 4
B. 8
C. No output, code runs lazily without action
D. Throws an error because no action is called
💡 Hint

Remember that transformations are lazy and actions trigger computation.
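One way to observe laziness directly is to count how often the function passed to map actually runs. The sketch below is only an illustration, not part of the challenge: it assumes spark-core is on the classpath and a local master, and the names LazyEvaluationSketch, observe, and mapCalls are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object LazyEvaluationSketch {
  // Returns (map calls before the action, map calls after the action, action result).
  def observe(sc: SparkContext): (Long, Long, Int) = {
    val calls = sc.longAccumulator("mapCalls")
    val words = sc.parallelize(Seq("a", "bb", "ccc"))

    // Transformation: only records the step in the lineage, the function does not run here.
    val lengths = words.map { w => calls.add(1); w.length }
    val before: Long = calls.value

    // Action: triggers a job, so the map function now runs once per element.
    val total = lengths.reduce(_ + _)
    val after: Long = calls.value

    (before, after, total)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two threads; the app name is arbitrary.
    val conf = new SparkConf().setMaster("local[2]").setAppName("lazy-evaluation-sketch")
    val sc = new SparkContext(conf)
    try {
      val (before, after, total) = observe(sc)
      println(s"before=$before after=$after total=$total")
    } finally sc.stop()
  }
}
```

In a local run with no task retries, the counter stays at zero until reduce is called. Accumulator updates made inside transformations can be applied more than once if a task is retried, so this counting trick suits demonstrations rather than production metrics.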

Data Output (intermediate)
Result of chained transformations and action
Given this Spark code, what is the output of the collect() action?
Apache Spark
val rdd = sc.parallelize(Seq(1, 2, 3, 4, 5))
val filtered = rdd.filter(_ % 2 == 0)
val mapped = filtered.map(_ * 10)
mapped.collect()
A. [10, 20]
B. [2, 4]
C. [1, 3, 5]
D. [20, 40]
💡 Hint

Filter keeps even numbers, then map multiplies by 10.

🔧 Debug (advanced)
Why does this Spark job not run?
Consider this Spark code snippet. Why does it not produce any output or run any job?
Apache Spark
val rdd = sc.parallelize(Seq(1, 2, 3))
rdd.map(_ * 2)
A. Because map is a transformation and no action is called to trigger execution
B. Because map is an action and should run immediately
C. Because the RDD is empty
D. Because SparkContext is not initialized
💡 Hint

Think about what triggers Spark jobs to run.

🧠 Conceptual (advanced)
Difference between transformations and actions
Which statement correctly describes the difference between transformations and actions in Spark?
A. Transformations immediately compute results; actions only define computations
B. Actions are lazy and transformations trigger execution
C. Transformations define a new RDD without executing; actions trigger execution and return results
D. Both transformations and actions trigger execution immediately
💡 Hint

Recall which operations cause Spark to run jobs.
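The static types in the RDD API make the distinction visible: some operations hand back another RDD, others hand back an ordinary Scala value. The following is a minimal sketch, assuming spark-core on the classpath and a local master; ReturnTypesSketch and evensTimesTen are hypothetical names used only for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object ReturnTypesSketch {
  // filter and map each return an RDD, so this method only builds a lineage.
  def evensTimesTen(numbers: RDD[Int]): RDD[Int] =
    numbers.filter(_ % 2 == 0).map(_ * 10)

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two threads; the app name is arbitrary.
    val conf = new SparkConf().setMaster("local[2]").setAppName("return-types-sketch")
    val sc = new SparkContext(conf)
    try {
      val numbers: RDD[Int] = sc.parallelize(Seq(6, 7, 8, 9))
      val described: RDD[Int] = evensTimesTen(numbers)

      // count, collect, and first return plain values to the driver.
      val howMany: Long = described.count()
      val values: Array[Int] = described.collect()
      val firstOne: Int = described.first()

      println(s"count=$howMany values=${values.mkString(", ")} first=$firstOne")
    } finally sc.stop()
  }
}
```

Checking whether an operation's return type is an RDD or a driver-side value is a quick way to classify operations you have not memorized.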

🚀 Application (expert)
Optimizing Spark job with transformations and actions
You have this Spark code. Each action launches its own job, so two jobs run either way. Which option avoids recomputing the map and filter steps when the second action runs?
Apache Spark
val rdd = sc.parallelize(Seq(1, 2, 3, 4, 5))
val mapped = rdd.map(_ * 2)
val filtered = mapped.filter(_ > 5)
val countResult = filtered.count()
val collectResult = filtered.collect()
A. Call count and collect without caching; Spark will optimize automatically
B. Cache the filtered RDD before calling count and collect
C. Call collect first, then count; order does not matter
D. Use two separate RDDs for count and collect to avoid recomputation
💡 Hint

Think about how caching affects repeated actions on the same RDD.
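The effect of caching on repeated actions can be measured by counting invocations of an upstream function. This is a hedged sketch on different data from the question, assuming spark-core on the classpath, a local master, and enough memory to hold the cached partitions; CachingSketch and mapCalls are hypothetical names.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CachingSketch {
  // Runs two actions on the same RDD and returns how many times the map function ran.
  def mapCalls(sc: SparkContext, useCache: Boolean): Long = {
    val calls = sc.longAccumulator("mapCalls")
    val squares = sc.parallelize(1 to 6).map { x => calls.add(1); x * x }
    val big = squares.filter(_ > 10)

    // cache() only marks the RDD for storage; the first action fills the cache.
    if (useCache) big.cache()

    big.count()    // first job
    big.collect()  // second job: recomputes the lineage unless big was cached

    big.unpersist()
    calls.value
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two threads; the app name is arbitrary.
    val conf = new SparkConf().setMaster("local[2]").setAppName("caching-sketch")
    val sc = new SparkContext(conf)
    try {
      println(s"without cache: ${mapCalls(sc, useCache = false)} map calls")
      println(s"with cache:    ${mapCalls(sc, useCache = true)} map calls")
    } finally sc.stop()
  }
}
```

Two jobs run in both cases. In a local run with no task retries or cache evictions, the cached variant evaluates the map function once per element, while the uncached variant evaluates it once per element per action.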