Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)
Complete the code to specify the Parquet file format when saving a DataFrame in Spark.
df.write.format("[1]").save("/data/output")
Common Mistakes
Using 'csv' or 'json' instead of 'parquet' will save in the wrong format.
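Omitting format() makes Spark fall back to the data source set in spark.sql.sources.default (Parquet unless it has been changed), so the output format then depends on configuration.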
To save data in Parquet format, use format("parquet").
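A minimal sketch of the completed line in PySpark, assuming `df` is an existing DataFrame. The helper name `save_as_parquet` is hypothetical and only wraps the call from the question; it takes the DataFrame as a parameter so the snippet itself does not need a running Spark session.

```python
def save_as_parquet(df, path):
    """Write df to path as Parquet by naming the format explicitly."""
    # format("parquet") selects the Parquet data source. Without it, Spark
    # uses whatever spark.sql.sources.default is set to.
    df.write.format("parquet").save(path)
```

With a real session this would be called as save_as_parquet(df, "/data/output").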
2. Fill in the blank (medium)
Complete the code to read an Avro file into a Spark DataFrame.
spark.read.format("[1]").load("/data/input.avro")
Common Mistakes
Using 'parquet' or 'orc' format to read Avro files causes errors.
The Avro data source ships as the separate spark-avro module; if it is not on the classpath, format("avro") fails.
To read Avro files, use format("avro") with Spark.
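A short sketch of the read, assuming `spark` is an existing SparkSession and that the spark-avro module is available. The helper names `spark_avro_package` and `read_avro` are hypothetical, and the version strings in the usage note are examples only; they have to match the Spark and Scala versions actually installed.

```python
def spark_avro_package(scala_version, spark_version):
    """Build the Maven coordinate of the spark-avro module."""
    return f"org.apache.spark:spark-avro_{scala_version}:{spark_version}"


def read_avro(spark, path):
    """Load an Avro file into a DataFrame (needs spark-avro on the classpath)."""
    return spark.read.format("avro").load(path)
```

A coordinate such as spark_avro_package("2.12", "3.5.0") is the kind of value passed to spark-submit via --packages or set in spark.jars.packages before the session starts.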
3. Fill in the blank (hard)
Fix the error in the code to write a DataFrame in ORC format.
df.write.[1]("/data/output.orc")
Common Mistakes
saveAsTextFile is an RDD method that writes plain text, so it does not produce ORC files.
Using non-existent methods like saveAsOrcFile causes errors.
Use df.write.orc(path) to write ORC files in Spark.
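The sketch below illustrates how the shortcut relates to the generic form, assuming `df` is an existing DataFrame. The names `SHORTCUT_FORMATS` and `write_as` are hypothetical; the set lists formats for which PySpark's DataFrameWriter has a dedicated method, and anything else (Avro, for example) goes through format(...).save(...).

```python
# Formats that have a dedicated DataFrameWriter method in PySpark.
SHORTCUT_FORMATS = {"parquet", "orc", "json", "csv", "text"}


def write_as(df, fmt, path):
    """Write df to path in the given format."""
    if fmt in SHORTCUT_FORMATS:
        # e.g. df.write.orc(path), equivalent to df.write.format("orc").save(path)
        getattr(df.write, fmt)(path)
    else:
        # No shortcut method exists, so name the format explicitly.
        df.write.format(fmt).save(path)
```

Calling write_as(df, "orc", "/data/output.orc") takes the first branch and matches the answer to this task.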
4. Fill in the blanks (hard)
Fill both blanks to create a DataFrame from a Parquet file and select only the 'name' column.
df = spark.read.[1]("/data/users.parquet").select("[2]")
Common Mistakes
Using 'avro' format to read Parquet files causes errors.
Selecting a column name that does not exist raises an AnalysisException; it does not return an empty result.
Use spark.read.parquet(path) to read Parquet files, then select("name") to keep only the 'name' column.
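A small sketch of the read-then-select step, assuming `spark` is an existing SparkSession. The helper `read_column` is hypothetical; it checks the top-level column names first so that a misspelled name produces a clear message instead of Spark's analysis error.

```python
def read_column(spark, path, column="name"):
    """Read a Parquet file and keep a single top-level column."""
    df = spark.read.parquet(path)
    # df.columns lists top-level column names only (nested fields are not included).
    if column not in df.columns:
        raise ValueError(f"column {column!r} not found; available: {df.columns}")
    return df.select(column)
```

For this task the call would be read_column(spark, "/data/users.parquet").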
5. Fill in the blanks (hard)
Fill all three blanks to write a DataFrame in Avro format with overwrite mode and save to '/data/avro_output'.
df.write.mode("[1]").format("[2]").save("[3]")
Common Mistakes
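Using 'append' mode adds the new rows to the existing data instead of replacing it.
Using the wrong format writes the data in a format other than Avro, or fails if that data source is unavailable.
Saving to a different path leaves the output somewhere other than '/data/avro_output'.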
Use mode("overwrite") to overwrite, format("avro") for Avro, and the correct path to save.
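The three parts of the answer can be sketched as one chained call, assuming `df` is an existing DataFrame and the spark-avro module is available. The helper name `overwrite_as_avro` is hypothetical.

```python
def overwrite_as_avro(df, path):
    """Replace whatever is at path with df written as Avro."""
    # mode("overwrite") replaces existing output. Other save modes are
    # "append", "ignore", and "error"/"errorifexists" (the default).
    df.write.mode("overwrite").format("avro").save(path)
```

For this task the call would be overwrite_as_avro(df, "/data/avro_output").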