
Accumulator variables in Apache Spark - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output (intermediate)
Output of a Spark accumulator after RDD operations

Consider the following Spark code that uses an accumulator to count even numbers in an RDD.

val accum = sc.longAccumulator("evenCount")
val rdd = sc.parallelize(1 to 5)
rdd.foreach(x => if (x % 2 == 0) accum.add(1))
println(accum.value)

What will be printed?

A. 3
B. 5
C. 0
D. 2
💡 Hint

Count how many numbers between 1 and 5 are even.

Data Output (intermediate)
Accumulator value after multiple actions

Given this Spark code snippet:

val accum = sc.longAccumulator("sumAccumulator")
val rdd = sc.parallelize(Seq(1, 2, 3))
rdd.foreach(x => accum.add(x))
rdd.foreach(x => accum.add(x * 2))
println(accum.value)

What is the value printed?

A. 12
B. 18
C. 21
D. 9
💡 Hint

Sum all elements, then sum all elements multiplied by 2.

🔧 Debug (advanced)
Identify the error in accumulator usage

What error will this Spark code produce?

val accum = sc.longAccumulator("accum")
val rdd = sc.parallelize(1 to 3)
val result = rdd.map(x => accum.add(x)).collect()
println(accum.value)
A. 6
B. Compilation error
C. 0
D. Runtime error: accumulator cannot be used in map
💡 Hint

Consider how accumulators behave in transformations and actions.

🧠 Conceptual (advanced)
Why accumulators are not reliable for transformations

Why should Spark accumulators not be used inside transformations such as map or filter to update values that your program's correctness depends on?

A. Because transformations are lazy and may be recomputed, causing accumulator updates to be counted multiple times
B. Because accumulators only work on the driver node and not on executors
C. Because accumulators cannot be created inside transformations
D. Because accumulators reset automatically after each transformation
💡 Hint

Think about Spark's lazy evaluation and task retries.
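The recomputation pitfall behind this question can be simulated without Spark. In the sketch below (an illustration, not Spark API), a lazy Scala view stands in for a transformation's lineage and a mutable var stands in for an accumulator; forcing the view twice models Spark recomputing an uncached RDD.

```scala
object AccumulatorPitfall {
  // Runs a "job" whose transformation has a counting side effect,
  // then returns the counter value.
  def doubleCount(): Long = {
    var count = 0L // stand-in for a Spark accumulator
    // A lazy view models a transformation: nothing runs until forced.
    val data = (1 to 3).view.map { x => count += 1; x * 2 }
    data.sum // first "action" evaluates the map
    data.sum // second "action" re-evaluates it, repeating the side effect
    count
  }

  def main(args: Array[String]): Unit =
    println(AccumulatorPitfall.doubleCount()) // 6, not 3: every update ran twice
}
```

This is why Spark only guarantees exactly-once accumulator updates inside actions such as foreach, while updates inside transformations may be applied more than once if a stage is recomputed or a task is retried.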

🚀 Application (expert)
Using accumulators to count errors in a Spark job

You want to count how many lines in a text file contain the word "error" using Spark accumulators. Which code snippet correctly counts the occurrences?

val errorCount = sc.longAccumulator("errorCount")
val lines = sc.textFile("log.txt")
// Which option correctly updates errorCount?
A.
val errors = lines.filter(line => line.contains("error"))
errors.foreach(_ => errorCount.add(1))
println(errorCount.value)
B.
val errors = lines.filter(line => { if (line.contains("error")) errorCount.add(1); true })
errors.count()
println(errorCount.value)
C.
lines.foreach(line => if (line.contains("error")) errorCount.add(1))
println(errorCount.value)
D.
val errors = lines.map(line => if (line.contains("error")) errorCount.add(1))
println(errorCount.value)
💡 Hint

Remember that accumulators update reliably only in actions, not transformations.
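The idiom the hint points at can be sketched in plain Scala without a cluster: a local counter stands in for sc.longAccumulator, and an eager foreach traversal stands in for a Spark action. All names here are illustrative.

```scala
object ErrorCount {
  // Count lines containing "error", updating the counter only while
  // the data is actually being traversed (the "action" side).
  def countErrors(lines: Seq[String]): Long = {
    var errorCount = 0L // stand-in for sc.longAccumulator("errorCount")
    lines.foreach(line => if (line.contains("error")) errorCount += 1)
    errorCount
  }

  def main(args: Array[String]): Unit = {
    val log = Seq("ok", "error: disk full", "warn", "error: timeout")
    println(ErrorCount.countErrors(log)) // prints 2
  }
}
```

In real Spark code the same shape applies: perform the add inside an action such as foreach, and read accum.value on the driver only after the action has completed.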