Apache Spark · ~10 mins

Why streaming enables real-time analytics in Apache Spark - Test Your Understanding

Practice - 5 Tasks
Answer the questions below
Question 1: fill in the blank (easy)

Complete the code to read streaming data from a socket source.

streamingDF = spark.readStream.format([1]).option("host", "localhost").option("port", 9999).load()
A. "json"
B. "csv"
C. "socket"
D. "parquet"
Common Mistakes
Using file formats like csv or json instead of socket for a streaming source.
Forgetting to specify the host and port options.
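For reference, a completed socket read might look like the sketch below. The host and port values are illustrative, and `spark` is assumed to be an existing `SparkSession`.

```python
# Sketch: read a text stream from a TCP socket (host/port are illustrative).
# Assumes `spark` is an existing SparkSession.
streamingDF = (
    spark.readStream
    .format("socket")              # socket source yields one string column, "value"
    .option("host", "localhost")
    .option("port", 9999)          # e.g. fed locally by `nc -lk 9999`
    .load()
)
```

Note that `load()` is lazy: the socket is not connected until a query on this DataFrame is started.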
Question 2: fill in the blank (medium)

Complete the code to write streaming data to the console sink.

query = streamingDF.writeStream.format([1]).outputMode("append").start()
A. "parquet"
B. "console"
C. "memory"
D. "csv"
Common Mistakes
Choosing a file format like parquet or csv, which writes to files rather than to the console.
Using the memory sink, which stores rows in an in-memory table but does not print them.
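A completed console write could be sketched as follows; `streamingDF` is assumed to come from a streaming read such as the socket source above.

```python
# Sketch: print each micro-batch to stdout via the console sink (debugging aid).
# Assumes `streamingDF` was created by a streaming read.
query = (
    streamingDF.writeStream
    .format("console")         # console sink: prints rows to stdout
    .outputMode("append")      # emit only rows added since the last trigger
    .start()
)
# query.awaitTermination()    # optionally block until the stream stops
```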
Question 3: fill in the blank (hard)

Complete the code to define a streaming aggregation that sums the 'value' column for each category.

aggDF = streamingDF.groupBy("category").agg([1]("value"))
A. sum
B. countDistinct
C. collect_list
D. max
Common Mistakes
Using collect_list, which collects values into a list but does not aggregate them numerically.
Using countDistinct, which counts unique items rather than summing them.
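A completed per-category sum might look like this sketch; the alias `total_value` is illustrative, and `streamingDF` is assumed to have "category" and "value" columns.

```python
from pyspark.sql import functions as F

# Sketch: a running sum of "value" per category.
# Assumes `streamingDF` has "category" and "value" columns.
aggDF = streamingDF.groupBy("category").agg(F.sum("value").alias("total_value"))
```

In Structured Streaming this defines an unbounded aggregation that Spark updates incrementally as new rows arrive; nothing runs until a query is started on `aggDF`.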
Question 4: fill in the blank (hard)

Fill both blanks to filter streaming data for values greater than 100 and select only the 'category' and 'value' columns.

filteredDF = streamingDF.filter(streamingDF.value [1] 100).select([2], "value")
A. >
B. "category"
C. <
D. "timestamp"
Common Mistakes
Using < instead of > in the filter condition.
Selecting the wrong column, such as 'timestamp' instead of 'category'.
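Filled in, the filter-and-project step could be sketched like this, again assuming `streamingDF` has "category" and "value" columns:

```python
# Sketch: keep rows with value > 100, then project two columns.
# Assumes `streamingDF` has "category" and "value" columns.
filteredDF = (
    streamingDF
    .filter(streamingDF.value > 100)   # comparison predicate, applied per row
    .select("category", "value")       # drop all other columns
)
```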
Question 5: fill in the blank (hard)

Fill all three blanks to create a streaming query that writes aggregated data to memory with complete output mode and a query name 'aggQuery'.

query = aggDF.writeStream.format([1]).outputMode([2]).queryName([3]).start()
A. "memory"
B. "complete"
C. "aggQuery"
D. "append"
Common Mistakes
Using append mode, which outputs only newly added rows rather than the full aggregation.
Using the console sink or a file format instead of the memory sink.
Not naming the query, or using incorrect queryName syntax.
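Putting the three blanks together, a memory-sink query might be sketched as follows; `aggDF` is assumed to be a streaming aggregation such as the groupBy/sum result from Question 3.

```python
# Sketch: write the full (complete) aggregation to an in-memory table.
# Assumes `aggDF` is a streaming aggregation (e.g. a groupBy().agg() result).
query = (
    aggDF.writeStream
    .format("memory")          # registers an in-memory table for ad-hoc queries
    .outputMode("complete")    # re-emit the entire aggregation on each trigger
    .queryName("aggQuery")     # the in-memory table is named after the query
    .start()
)
# While the query runs, the table can be inspected with SQL:
# spark.sql("SELECT * FROM aggQuery").show()
```

The memory sink is intended for testing and debugging, since the whole output table is kept in the driver's memory.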