Complete the code to perform a broadcast join in Spark.
result = df1.join([1](df2), on='id')
A broadcast join ships the smaller DataFrame to every worker node, so the larger DataFrame can be joined locally without a shuffle, which improves performance.
Complete the code to specify the join type as an inner join.
joined_df = df1.join(df2, on='key', how=[1])
The 'inner' value sets the logical join type. The physical join strategy is chosen by Spark's optimizer, which prefers a sort-merge join by default; a shuffle hash join can be requested explicitly with the 'shuffle_hash' hint.
Fix the error in the code to avoid a costly shuffle join by broadcasting the smaller DataFrame.
from pyspark.sql.functions import [1]
joined = df1.join([1](df2), 'id')
Broadcasting the smaller DataFrame avoids shuffle and improves join performance.
Fill both blanks to create a dictionary comprehension that filters words longer than 4 characters and squares their lengths.
lengths = {word: len(word)[1]2 for word in words if len(word) [2] 4}
We square the length with '**2' and filter words with length greater than 4 using '>'.
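With both blanks filled in, the comprehension runs as below; the `words` list is a made-up sample for illustration:

```python
words = ["spark", "is", "fast", "scalable"]

# Keep only words longer than 4 characters; map each to its squared length.
lengths = {word: len(word) ** 2 for word in words if len(word) > 4}
print(lengths)  # {'spark': 25, 'scalable': 64}
```

Note that the filter is strict ('>'), so a 4-letter word like "fast" is excluded.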
Fill all three blanks to create a dictionary comprehension that uppercases keys, keeps values, and filters positive values.
result = {[1]: [2] for k, v in data.items() if v [3] 0}
Keys are uppercased with 'k.upper()', values kept as 'v', and filtered where 'v > 0'.
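The filled-in comprehension can be exercised with sample data (the `data` dict here is assumed for illustration):

```python
data = {"a": 3, "b": -1, "c": 0, "d": 7}

# Uppercase each key, keep the value unchanged, and drop non-positive values.
result = {k.upper(): v for k, v in data.items() if v > 0}
print(result)  # {'A': 3, 'D': 7}
```

Zero is excluded as well, since the filter requires strictly positive values.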