Complete the code to read a Parquet file into a Spark DataFrame.
df = spark.read.[1]("data.parquet")
Parquet is a columnar storage format that Spark supports natively. Calling spark.read.parquet() reads the file correctly and efficiently.
Complete the code to write a DataFrame in an efficient columnar format.
df.write.[1]("output_path")
Writing data in Parquet format stores it in a columnar way, improving read and query speed.
Fix the error in the code to read a JSON file with Spark.
df = spark.read.[1]("data.json")
To read a JSON file, use spark.read.json(). Note that Spark expects JSON Lines input (one JSON object per line) by default. Using another reader, such as csv or text, raises errors or parses the data incorrectly.
Fill both blanks to create a dictionary of word lengths for words longer than 3 characters.
lengths = {word: [1] for word in words if len(word) [2] 3}
The dictionary maps each word to its length using len(word). The condition filters words longer than 3 characters with >.
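With the blanks filled in (len(word) and >), a self-contained run on sample data; the words list is made up for illustration:

```python
words = ["sun", "moon", "planet", "sky"]  # illustrative sample input

# [1] = len(word), [2] = > : map each word longer than 3 characters to its length.
lengths = {word: len(word) for word in words if len(word) > 3}
print(lengths)  # {'moon': 4, 'planet': 6}
```

Note that "sun" and "sky" are excluded because a length of exactly 3 does not satisfy the strict > 3 condition.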
Fill all three blanks to create a dictionary with uppercase keys and values greater than zero.
result = { [1]: [2] for k, v in data.items() if v [3] 0 }
The keys are uppercased with k.upper(), the values are v unchanged, and the condition v > 0 keeps only values greater than zero.
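With all three blanks filled in (k.upper(), v, and >), a self-contained run; the data dictionary is made up for illustration:

```python
data = {"apples": 3, "pears": 0, "plums": -2, "figs": 5}  # illustrative sample input

# [1] = k.upper(), [2] = v, [3] = > : uppercase keys, keep positive values only.
result = {k.upper(): v for k, v in data.items() if v > 0}
print(result)  # {'APPLES': 3, 'FIGS': 5}
```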