Apache Spark | data | ~10 mins

Why join strategy affects performance in Apache Spark - Test Your Understanding

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to perform a broadcast join in Spark.

Apache Spark
result = df1.join([1](df2), on='id')
A. broadcast
B. shuffle
C. sort
D. cache
Common Mistakes
Falling back to a shuffle-based join when the smaller DataFrame should be broadcast.
Trying to use 'cache' or 'sort' instead of 'broadcast'.
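To see why broadcasting the small side helps, here is a minimal pure-Python sketch of the idea behind a broadcast hash join. It is a toy model, not Spark's implementation: the function name broadcast_hash_join and the sample rows are hypothetical, and plain lists of dicts stand in for partitions of a DataFrame.

```python
from collections import defaultdict

def broadcast_hash_join(large_partitions, small_rows, key):
    """Toy model of a broadcast hash join (not Spark's actual implementation)."""
    # Build one hash table from the small side. In a cluster, this table
    # is what would be copied to every executor.
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[key]].append(row)

    joined = []
    # Each partition of the large side is joined in place,
    # so the large side is never repartitioned.
    for partition in large_partitions:
        for row in partition:
            for match in lookup.get(row[key], []):
                joined.append({**row, **match})
    return joined

# Hypothetical data: two partitions of a large table and one small table.
orders = [
    [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}],
    [{"id": 1, "amount": 5}, {"id": 3, "amount": 7}],
]
customers = [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Bo"}]

rows = broadcast_hash_join(orders, customers, "id")
```

The order with id 3 has no matching customer, so the inner join drops it and rows holds three joined records.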
2. Fill in the blank (medium)

Complete the code to perform an inner join. The 'how' argument takes a logical join type; a physical strategy such as shuffle hash cannot be passed through it.

Apache Spark
joined_df = df1.join(df2, on='key', how='[1]')
A. broadcast
B. shuffle_hash
C. inner
D. left
Common Mistakes
Using 'broadcast' as a join type string; it is a function, not a value for 'how'.
Using 'shuffle_hash', which is not a valid join type string.
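For contrast with the join type passed to 'how', the following pure-Python sketch shows what a shuffle hash join does conceptually. It is a toy model under stated assumptions, not Spark's implementation: shuffle_hash_join, num_partitions, and the sample rows are hypothetical names chosen for illustration. In Spark 3.0 and later, the strategy itself is requested with a join hint such as df2.hint("shuffle_hash"), while 'how' stays 'inner', 'left', and so on.

```python
def shuffle_hash_join(left_rows, right_rows, key, num_partitions=2):
    """Toy model of a shuffle hash join (not Spark's actual implementation)."""
    # Shuffle step: route every row of BOTH sides to a partition chosen
    # from its key, so matching keys land in the same partition.
    left_parts = [[] for _ in range(num_partitions)]
    right_parts = [[] for _ in range(num_partitions)]
    for row in left_rows:
        left_parts[hash(row[key]) % num_partitions].append(row)
    for row in right_rows:
        right_parts[hash(row[key]) % num_partitions].append(row)

    joined = []
    # Join step: inside each partition, hash one side and probe with the other.
    for left_part, right_part in zip(left_parts, right_parts):
        table = {}
        for row in right_part:
            table.setdefault(row[key], []).append(row)
        for row in left_part:
            for match in table.get(row[key], []):
                joined.append({**row, **match})
    return joined

# Hypothetical data with integer keys.
left = [{"key": 1, "x": "a"}, {"key": 2, "x": "b"}, {"key": 3, "x": "c"}]
right = [{"key": 2, "y": "B"}, {"key": 3, "y": "C"}, {"key": 4, "y": "D"}]

result = shuffle_hash_join(left, right, "key")
```

Only keys 2 and 3 appear on both sides, so result holds two joined rows. The point of the sketch is that every row of both inputs is moved before any matching happens, which is the cost a broadcast join avoids.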
3. Fill in the blank (hard)

Complete the code to avoid a costly shuffle join by broadcasting the smaller DataFrame.

Apache Spark
from pyspark.sql.functions import [1]
joined = df1.join([1](df2), 'id')
A. shuffle
B. broadcast
C. cache
D. persist
Common Mistakes
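Importing 'shuffle', which in pyspark.sql.functions randomly permutes an array column and has nothing to do with join strategy.
Using 'cache' or 'persist', which are DataFrame methods that store data for reuse and do not request a broadcast join.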
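A rough back-of-the-envelope comparison shows why the shuffle is the costly part. The sketch below counts rows sent over the network under simplifying assumptions: it ignores row width, compression, data locality, and the cost of collecting the small side on the driver. The function names and the sizes are hypothetical.

```python
def rows_moved_shuffle(large_rows, small_rows):
    # A shuffle-based join repartitions both sides by key, so in the
    # worst case every row of both inputs crosses the network once.
    return large_rows + small_rows

def rows_moved_broadcast(small_rows, num_executors):
    # A broadcast join copies only the small side, once per executor.
    # The large side stays where it is.
    return small_rows * num_executors

# Hypothetical sizes: a 100M-row fact table, a 10K-row lookup table, 50 executors.
large, small, executors = 100_000_000, 10_000, 50

shuffle_cost = rows_moved_shuffle(large, small)
broadcast_cost = rows_moved_broadcast(small, executors)
```

With these numbers the shuffle moves 100,010,000 rows and the broadcast moves 500,000. The advantage shrinks as the small side grows, because its size is multiplied by the number of executors, which is why broadcasting only makes sense for a DataFrame that is small.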
4. Fill in the blank (hard)

Fill both blanks to create a dictionary comprehension that keeps only words longer than 4 characters and maps each word to the square of its length.

Apache Spark
lengths = {word: len(word)[1]2 for word in words if len(word) [2] 4}
A. **
B. >
C. <
D. *
Common Mistakes
Using '*' instead of '**' for power.
Using '<' instead of '>' in the condition.
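As a worked example of this comprehension, the sketch below runs it on a hypothetical word list (the contents of words are an assumption; the exercise does not define them).

```python
# Hypothetical input; the exercise does not say what 'words' contains.
words = ["spark", "join", "shuffle", "hash", "broadcast"]

# Keep words longer than 4 characters and map each to its squared length.
lengths = {word: len(word) ** 2 for word in words if len(word) > 4}

# For comparison: '*' multiplies instead of raising to a power.
doubled = {word: len(word) * 2 for word in words if len(word) > 4}
```

Here lengths is {'spark': 25, 'shuffle': 49, 'broadcast': 81}, while doubled gives 10, 14, and 18 for the same words, which is the difference between '**' and '*'.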
5. Fill in the blank (hard)

Fill all three blanks to create a dictionary comprehension that uppercases the keys, leaves the values unchanged, and keeps only entries with positive values.

Apache Spark
result = {[1]: [2] for k, v in data.items() if v [3] 0}
A. k
B. v
C. >
D. k.upper()
Common Mistakes
Using 'k' instead of 'k.upper()' for keys.
Using '<' instead of '>' in the condition.
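A short worked example of this comprehension, using a hypothetical data dictionary (its contents are an assumption made for illustration):

```python
# Hypothetical input; the exercise does not say what 'data' contains.
data = {"a": 3, "b": -1, "c": 0, "d": 7}

# Uppercase each key, keep the value, and drop entries that are not positive.
result = {k.upper(): v for k, v in data.items() if v > 0}
```

Here result is {'A': 3, 'D': 7}. The entry with value 0 is dropped along with the negative one, because the condition is strictly greater than zero.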