Complete the code to read a streaming DataFrame from a socket source.
streamingDF = spark.readStream.format([1]).option("host", "localhost").option("port", 9999).load()
The format "socket" is used to read streaming data from a socket source.
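With the blank filled in, the completed statement looks like this (a sketch assuming an existing SparkSession named `spark` and a server listening on localhost:9999, e.g. started with `nc -lk 9999`; the socket source is intended for testing, not production):

```python
# Read a stream of text lines from a TCP socket source.
streamingDF = (spark.readStream
    .format("socket")            # blank [1] = "socket"
    .option("host", "localhost")
    .option("port", 9999)
    .load())
```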
Complete the code to start the streaming query that writes output to the console.
query = streamingDF.writeStream.format([1]).start()
The "console" format writes streaming output to the console for easy viewing.
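The completed query, as a sketch assuming `streamingDF` is the streaming DataFrame from the previous step:

```python
# Write the streaming results to the console sink.
query = (streamingDF.writeStream
    .format("console")   # blank [1] = "console"
    .start())

# start() returns a StreamingQuery; block until it stops with:
# query.awaitTermination()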
Fix the error in the code to specify the output mode for the streaming query.
query = streamingDF.writeStream.outputMode([1]).format("console").start()
The "append" output mode outputs only new rows added to the result table since the last trigger.
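Filled in, the corrected statement reads (assuming the same `streamingDF` as above; "append" is also the default output mode and only works for queries without unbounded aggregations):

```python
query = (streamingDF.writeStream
    .outputMode("append")   # blank [1] = "append": emit only new rows per trigger
    .format("console")
    .start())
```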
Fill both blanks to create a streaming aggregation that counts words from a streaming DataFrame.
wordCounts = streamingDF.selectExpr("explode(split(value, ' ')) as word").groupBy([1]).count().writeStream.outputMode([2]).format("console").start()
Grouping by "word" counts each word, and "complete" mode outputs the full aggregation result.
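With both blanks filled, the completed aggregation looks like this (a sketch assuming `streamingDF` has a string column named `value`, as the socket source produces):

```python
# Split each line into words, count occurrences of each word,
# and re-emit the full counts table on every trigger.
wordCounts = (streamingDF
    .selectExpr("explode(split(value, ' ')) as word")
    .groupBy("word")             # blank [1] = "word"
    .count()
    .writeStream
    .outputMode("complete")      # blank [2] = "complete"
    .format("console")
    .start())
```

"append" mode would not work here, because an aggregation keeps updating existing rows rather than only adding new ones.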
Fill all three blanks to define a streaming query that reads JSON files, selects a column, and writes to the memory sink.
streamingDF = spark.readStream.format([1]).load("/path/to/json")
query = streamingDF.select([2]).writeStream.format([3]).start()
Use the "json" format to read JSON files, select the "name" column, and write to the "memory" sink so the results can be queried as an in-memory table.
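A completed sketch follows. Two details beyond the blanks are worth noting: file-based streaming sources require an explicit schema (unless `spark.sql.streaming.schemaInference` is enabled), and the memory sink needs a `queryName` so the table can be queried with `spark.sql()`. The schema and the table name `names` here are illustrative assumptions.

```python
from pyspark.sql.types import StructType, StructField, StringType

# File streaming sources need a user-specified schema.
schema = StructType([StructField("name", StringType())])

streamingDF = (spark.readStream
    .format("json")           # blank [1] = "json"
    .schema(schema)
    .load("/path/to/json"))

query = (streamingDF
    .select("name")           # blank [2] = "name"
    .writeStream
    .format("memory")         # blank [3] = "memory"
    .queryName("names")       # hypothetical table name for illustration
    .start())

# The results can then be queried like any table:
# spark.sql("SELECT * FROM names").show()
```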