Cross joins and when to avoid them in Apache Spark - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to perform a cross join between two DataFrames df1 and df2.

Apache Spark
result = df1.[1](df2)
A. union
B. join
C. crossJoin
D. select
Common Mistakes
Using 'join', which performs inner, outer, and other condition-based joins, instead of 'crossJoin'.
Using 'union', which appends the rows of one DataFrame to another but does not pair them.
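To see why crossJoin and union give such different results, it can help to model both on tiny inputs. The sketch below uses plain Python lists and itertools.product as a stand-in for two small DataFrames. It is not Spark code, and the sizes and colors data is made up for illustration; the comments name the PySpark calls each step corresponds to.

```python
from itertools import product

# Hypothetical stand-ins for the rows of df1 and df2.
sizes = ["S", "M"]
colors = ["red", "green", "blue"]

# A cross join pairs every row on the left with every row on the right.
# PySpark counterpart: df1.crossJoin(df2)
cross = list(product(sizes, colors))

# A union only stacks the rows of one input under the other.
# PySpark counterpart: df1.union(df2)
stacked = sizes + colors

print(len(cross))    # 2 * 3 pairs
print(len(stacked))  # 2 + 3 rows
```

The pair count grows multiplicatively with the input sizes, which is the reason cross joins on large DataFrames are usually worth avoiding.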
2. Fill in the blank (medium)

Complete the code to count the number of rows in the cross join result.

Apache Spark
count = df1.crossJoin(df2).[1]()
A. count
B. collect
C. print
D. show
Common Mistakes
Using 'show', which displays rows but does not return a count.
Using 'collect', which gathers all rows to the driver but does not count them.
3. Fill in the blank (hard)

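Complete the code to enable cross joins through the session configuration. (In Spark 2.x this setting is needed for implicit Cartesian products, such as a join with no condition; an explicit crossJoin call does not depend on it, and Spark 3.0 and later enable it by default.)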

Apache Spark
spark.conf.set('spark.sql.crossJoin.enabled', [1])
A. 'true'
B. 'false'
C. True
D. False
Common Mistakes
Passing the Python boolean True or False where the exercise expects the string 'true' or 'false'.
Setting the value to 'false', which disables cross joins.
4. Fill in the blank (hard)

Complete the code to calculate the expected number of rows in a cross join result (product of individual row counts).

Apache Spark
expected_rows = df1.[1]() * df2.[1]()
A. collect
B. count
C. show
D. size
Common Mistakes
Using 'show' or 'collect', which do not return row counts.
There is no 'size()' method on DataFrame.
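Because the result size is the product of the two row counts, one way to decide when to avoid a cross join is to compute that product first and refuse to proceed past a limit. The following is a minimal sketch, not a Spark feature: guarded_cross_join and its max_rows parameter are hypothetical names, and left and right are assumed to be PySpark DataFrames, of which only the count() and crossJoin() methods are used. Each count() call runs a Spark job of its own, so the check is not free.

```python
def guarded_cross_join(left, right, max_rows=1_000_000):
    """Cross join two DataFrames only if the result stays within max_rows.

    `left` and `right` are assumed to be PySpark DataFrames; only their
    count() and crossJoin() methods are called. The helper name and the
    max_rows threshold are illustrative choices, not part of Spark.
    """
    # A cross join yields one row per (left row, right row) pair.
    expected_rows = left.count() * right.count()
    if expected_rows > max_rows:
        raise ValueError(
            f"cross join would produce {expected_rows} rows, "
            f"which exceeds the limit of {max_rows}"
        )
    return left.crossJoin(right)
```

For example, guarded_cross_join(df1, df2, max_rows=10_000) returns df1.crossJoin(df2) when the inputs are small and raises ValueError otherwise.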
5. Fill in the blank (hard)

Complete the code to disable cross joins, so that an accidental Cartesian product on large datasets is rejected instead of run.

Apache Spark
spark.conf.set('spark.sql.crossJoin.enabled', [1])
A. true
B. 'true'
C. false
D. 'false'
Common Mistakes
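Using 'true', which enables cross joins instead of disabling them.
Using unquoted true or false, which are not defined names in Python (the booleans are True and False); the exercise expects the string 'false'.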
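The string-versus-boolean confusion in tasks 3 and 5 can be sidestepped by doing the conversion once in a small helper. The sketch below rests on stated assumptions: set_cross_join_enabled is a hypothetical helper name, and spark is assumed to be an active SparkSession. It relies only on spark.conf.set, the same call used in the task.

```python
CROSS_JOIN_KEY = "spark.sql.crossJoin.enabled"


def set_cross_join_enabled(spark, enabled):
    """Set spark.sql.crossJoin.enabled from a Python boolean.

    `spark` is assumed to be an active SparkSession. The flag is written
    as the lowercase string 'true' or 'false', the form the tasks expect.
    Returns the string that was written, for easy inspection.
    """
    value = "true" if enabled else "false"
    spark.conf.set(CROSS_JOIN_KEY, value)
    return value
```

With this in place, set_cross_join_enabled(spark, False) has the same effect as the completed line in this task.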