Hadoop - Modern Data Architecture with Hadoop
Consider this simplified streaming code snippet using Kafka and Spark in a Kappa architecture:
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('KappaExample').getOrCreate()
# Read stream from Kafka topic
stream_df = spark.readStream.format('kafka') \
.option('kafka.bootstrap.servers', 'localhost:9092') \
.option('subscribe', 'events') \
.load()
# Select and cast value to string
value_df = stream_df.selectExpr('CAST(value AS STRING)')
# Write stream to console
query = value_df.writeStream.outputMode('append').format('console').start()
query.awaitTermination()What will this code do?
