Debug question 6 of 15 (difficulty: medium)
Hadoop - Modern Data Architecture with Hadoop

Examine this Spark Structured Streaming code snippet in a Kappa architecture:

df = (
  spark.readStream.format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "topic1")
  .load()
)

result = (
  df.selectExpr("CAST(value AS STRING) as data")
  .writeStream.format("console")
  .start()
)

What is the main issue with this code?

A. Incorrect Kafka bootstrap server address format
B. Missing checkpoint location for fault tolerance
C. Using 'console' sink is not supported in Spark Structured Streaming
D. Not specifying the schema for Kafka messages
Step-by-Step Solution
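  1. Step 1: Review the streaming write options

    A streaming query can recover from a failure only if it records its progress (offsets and state) in a checkpoint.
  2. Step 2: Check the code for a checkpoint option

    The writeStream chain has no '.option("checkpointLocation", "path")'.
  3. Step 3: Understand the consequences

    Without a durable checkpoint location, a restarted query cannot resume from where it stopped. The console sink still starts because Spark falls back to a temporary checkpoint directory, but most other sinks fail at start() unless a checkpoint location is set on the query or through spark.sql.streaming.checkpointLocation.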
  4. Final Answer:

    Missing checkpoint location for fault tolerance -> Option B
  5. Quick Check:

    A durable checkpoint location is required for a streaming query to recover after a failure [OK]
Quick Trick: Always set checkpointLocation in streaming writes [OK]
Common Mistakes:
  • Assuming console sink needs no checkpoint
  • Ignoring fault tolerance in streaming jobs
  • Confusing bootstrap server syntax
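
The fix described in the solution can be sketched as below. This is a minimal illustration, not a definitive implementation: the function name `start_console_query` and the `checkpoint_dir` parameter are hypothetical names introduced here, and the sketch assumes an existing SparkSession with the Kafka connector package available. The only change to the query from the question is the added `checkpointLocation` option on the write side.

```python
def start_console_query(spark, checkpoint_dir,
                        servers="localhost:9092", topic="topic1"):
    """Start the Kafka-to-console query with an explicit checkpoint location.

    `spark` is an existing SparkSession. `checkpoint_dir` is a hypothetical
    parameter; in practice it should point to durable storage (for example
    HDFS) so that a restarted query can find its recorded offsets.
    """
    # Read side: unchanged from the snippet in the question.
    df = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", servers)
        .option("subscribe", topic)
        .load()
    )

    # Write side: the same query, plus the checkpointLocation option
    # that the original code was missing.
    return (
        df.selectExpr("CAST(value AS STRING) AS data")
        .writeStream.format("console")
        .option("checkpointLocation", checkpoint_dir)
        .start()
    )
```

A call might look like `query = start_console_query(spark, "/tmp/checkpoints/topic1-console")` followed by `query.awaitTermination()`, where the path is only a placeholder. Restarting the query with the same checkpoint directory lets Spark continue from the recorded Kafka offsets rather than starting over.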
