Predict Output Q13 of 15 (medium)
Hadoop - Modern Data Architecture with Hadoop

Consider this simplified streaming code snippet using Kafka and Spark in a Kappa architecture:

from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('KappaExample').getOrCreate()

# Read stream from Kafka topic
stream_df = spark.readStream.format('kafka') \
  .option('kafka.bootstrap.servers', 'localhost:9092') \
  .option('subscribe', 'events') \
  .load()

# Select and cast value to string
value_df = stream_df.selectExpr('CAST(value AS STRING)')

# Write stream to console
query = value_df.writeStream.outputMode('append').format('console').start()

# Block until the streaming query is stopped or fails
query.awaitTermination()

What will this code do?

A. Continuously print new Kafka messages from 'events' topic to console
B. Read all past Kafka messages once and stop
C. Write data back to Kafka topic 'events'
D. Throw an error due to missing schema
Step-by-Step Solution:
  1. Step 1: Understand the streaming read

    The code reads a continuous stream from Kafka topic 'events' using Spark Structured Streaming.
  2. Step 2: Analyze output mode and sink

    Output mode 'append' with console sink means new messages print continuously to console.
  3. Final Answer:

Continuously print new Kafka messages from the 'events' topic to the console -> Option A
  4. Quick Check:

    Streaming read + console output = continuous print [OK]
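Since running the snippet itself requires a live Kafka broker, the 'append' behaviour can be illustrated with a stand-alone Python simulation (a toy sketch only; the `micro_batches` data and `console_sink` helper are hypothetical, not part of Spark):

```python
# Toy simulation of Spark's 'append' output mode with a console sink.
# Each micro-batch delivers only rows that arrived since the last batch,
# and the sink prints them as they come; nothing is ever re-emitted.

def console_sink(batch_id, rows):
    """Mimic the console sink: format each new row exactly once."""
    return [f"Batch {batch_id}: {row}" for row in rows]

# Hypothetical micro-batches of new messages on the 'events' topic
micro_batches = [
    ["click:home"],            # batch 0: one new message
    ["click:cart", "buy:42"],  # batch 1: two new messages
    [],                        # batch 2: no new data -> nothing printed
]

output = []
for batch_id, rows in enumerate(micro_batches):
    output.extend(console_sink(batch_id, rows))

for line in output:
    print(line)
```

Note that the empty batch produces no output at all, which is exactly why option B (read everything once and stop) is wrong: the query keeps waiting for new data instead of terminating.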

Common Mistakes:
  • Thinking it reads all past messages once
  • Assuming it writes back to Kafka
  • Expecting schema error without explicit schema
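On the last mistake: the Kafka source needs no user-supplied schema because every record arrives with a fixed schema (key, value, topic, partition, offset, timestamp), where `value` is raw bytes. The `CAST(value AS STRING)` step is simply a bytes-to-text decode, which can be sketched in plain Python (the sample payloads below are hypothetical):

```python
# Kafka delivers message values as raw bytes; Spark's
# selectExpr('CAST(value AS STRING)') decodes them to text.
# Plain-Python equivalent of that cast:

raw_values = [b'{"event": "click"}', b'{"event": "purchase"}']  # hypothetical payloads

decoded = [v.decode("utf-8") for v in raw_values]

print(decoded[0])  # -> {"event": "click"}
```

Without the cast, the console sink would print the values as opaque binary, but it would not raise a schema error either way.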
