Apache Spark · data · ~10 mins

Writing output with partitioning in Apache Spark - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
Task 1 (fill in the blank, easy)

Complete the code to write the DataFrame as a parquet file.

Apache Spark
df.write.parquet([1])
Options:
A. parquet
B. write
C. df
D. "/path/to/output"
Common Mistakes
Passing the DataFrame itself instead of a path.
Using the method name as a string.
Not providing any argument.
Task 2 (fill in the blank, medium)

Complete the code to partition the output by the column 'year'.

Apache Spark
df.write.partitionBy([1]).parquet("/output/path")
Options:
A. year
B. "month"
C. "year"
D. partitionBy
Common Mistakes
Passing the column name without quotes.
Using the wrong column name.
Using the method name as argument.
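
For orientation: partitionBy does not change the rows, only where they land. Spark writes one subdirectory per distinct value of the partition column, named column=value, and leaves that column out of the data files themselves. The plain-Python sketch below mimics only that directory layout and does not call Spark; the function name partition_layout, the sample rows, and the base path are hypothetical.

```python
from collections import defaultdict


def partition_layout(rows, column, base_path):
    """Group rows the way a Hive-style partitioned write lays them out.

    Illustration only: this mimics the directory naming, not Spark itself.
    """
    layout = defaultdict(list)
    for row in rows:
        # The directory name encodes the partition value, e.g. ".../year=2023".
        directory = f"{base_path}/{column}={row[column]}"
        # The partition column lives in the path, so it is dropped from the row.
        layout[directory].append({k: v for k, v in row.items() if k != column})
    return dict(layout)


# Hypothetical sample data.
rows = [
    {"id": 1, "year": 2023},
    {"id": 2, "year": 2024},
    {"id": 3, "year": 2023},
]
layout = partition_layout(rows, "year", "/output/path")
for directory in sorted(layout):
    print(directory, layout[directory])
```

Each key in the result corresponds to one directory Spark would create under the output path; a reader that later filters on the partition column can skip the directories that do not match.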
Task 3 (fill in the blank, hard)

Fill in the blank so the code writes the data partitioned by 'country'.

Apache Spark
df.write.partitionBy([1]).save("/data/output")
Options:
A. "country"
B. save
C. partitionBy
D. country
Common Mistakes
Passing the column name without quotes.
Passing the method name instead of the column name.
Using the save method incorrectly.
Task 4 (fill in the blank, hard)

Fill both blanks to write the DataFrame partitioned by 'state' and saved as JSON.

Apache Spark
df.write.[1]By([2]).json("/json/output")
Options:
A. partition
B. "state"
D. save
Common Mistakes
Using 'partition' instead of 'partitionBy'.
Not quoting the column name.
Using 'save' instead of 'json'.
Task 5 (fill in the blank, hard)

Fill all three blanks to write the DataFrame partitioned by 'category', saved as parquet, and overwrite existing data.

Apache Spark
df.write.mode([1]).[2]By([3]).parquet("/final/output")
Options:
A. "append"
B. "overwrite"
C. "category"
D. partition
Common Mistakes
Using 'append' mode instead of 'overwrite'.
Not using 'partitionBy' method.
Passing the column name without quotes.
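
The mode(...) call decides what happens when the output path already holds data. A minimal plain-Python sketch of the four commonly documented save modes follows; it only models the decision and does not call Spark, and the function name save_mode_action and the returned strings are hypothetical.

```python
def save_mode_action(mode, path_has_data):
    """Describe what a write does for a given save mode (illustration only)."""
    if mode in ("error", "errorifexists"):  # Spark's default behaviour
        return "fail" if path_has_data else "write"
    if mode == "append":
        return "add new files alongside existing data" if path_has_data else "write"
    if mode == "overwrite":
        return "replace existing data" if path_has_data else "write"
    if mode == "ignore":
        return "skip the write" if path_has_data else "write"
    raise ValueError(f"unknown save mode: {mode!r}")


for mode in ("errorifexists", "append", "overwrite", "ignore"):
    print(mode, "->", save_mode_action(mode, path_has_data=True))
```

When combined with partitionBy, "overwrite" under Spark's default settings replaces everything under the output path, not only the partitions present in the new data.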