Apache Spark | data | ~10 mins

Lazy evaluation in Apache Spark - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to create a Spark DataFrame from a list.

Apache Spark
data = [(1, 'Alice'), (2, 'Bob')]
spark_df = spark.createDataFrame([1], ['id', 'name'])
Options:
A. data
B. sc
C. list
D. rdd
Common Mistakes
Using 'sc', which is the SparkContext, not the data.
Using 'rdd', which is not defined here; the list would first have to be converted to an RDD.
2. Fill in the blank (medium)

Complete the code to apply a filter transformation that keeps only the rows whose id is greater than 1.

Apache Spark
filtered_df = spark_df.filter(spark_df['id'] [1] 1)
Options:
A. <=
B. ==
C. !=
D. >
Common Mistakes
Using '==', which keeps only the rows whose id equals 1.
Using '<=', which keeps the rows whose id is 1 or smaller.
3. Fill in the blank (hard)

Complete the code with an action that triggers evaluation of the lazily built DataFrame and displays the results.

Apache Spark
filtered_df.[1]()
Options:
A. show
B. filter
C. select
D. map
Common Mistakes
Using 'filter', which is a transformation, not an action.
Using 'map', which is not a PySpark DataFrame method.
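
The difference between a transformation and an action can be sketched in plain Python, without Spark. This is a toy model only, not how Spark is implemented: the `LazyRows` class, the `id_greater_than_one` helper, and the `calls` list are hypothetical names introduced for illustration. The sketch assumes that a transformation only records a step, and that an action is what finally runs the recorded steps.

```python
# Toy model of lazy evaluation in plain Python. This is NOT Spark's
# implementation; LazyRows is a hypothetical stand-in for a DataFrame.
class LazyRows:
    def __init__(self, rows, steps=()):
        self._rows = rows
        self._steps = tuple(steps)  # pending predicates, not yet executed

    def filter(self, predicate):
        # Transformation: record the step and return a new object.
        # No row is examined here.
        return LazyRows(self._rows, self._steps + (predicate,))

    def count(self):
        # Action: run every recorded step over the rows and return a value.
        return sum(1 for row in self._rows if all(p(row) for p in self._steps))


calls = []  # records every time the predicate is actually evaluated


def id_greater_than_one(row):
    calls.append(row)
    return row[0] > 1


data = [(1, "Alice"), (2, "Bob")]
filtered = LazyRows(data).filter(id_greater_than_one)
calls_before_action = len(calls)  # 0: filter only recorded the step
row_count = filtered.count()      # the action triggers the work
calls_after_action = len(calls)   # 2: the predicate ran once per row
```

In this sketch the predicate is never called until `count()` runs, which mirrors why a chain of transformations on its own produces no output in the tasks here.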
4. Fill in the blank (hard)

Fill both blanks to create a new DataFrame with selected columns and trigger computation.

Apache Spark
selected_df = spark_df.[1]('name')
selected_df.[2]()
Options:
A. select
B. show
C. filter
D. count
Common Mistakes
Using 'filter' instead of 'select' to pick columns.
Using 'count', which triggers computation but returns the number of rows instead of displaying the data.
5. Fill in the blank (hard)

Fill all three blanks to create a filtered DataFrame, select a column, and count the rows.

Apache Spark
filtered = spark_df.filter(spark_df['id'] [1] 1)
selected = filtered.[2]('name')
row_count = selected.[3]()
Options:
A. >
B. select
C. count
D. filter
Common Mistakes
Using 'filter' instead of 'select' for the second blank.
Using 'show' instead of 'count' for the last blank.