Complete the code to read a Parquet file into a Spark DataFrame.
df = spark.read.[1]("data/sample.parquet")
The parquet method reads Parquet files into a DataFrame.
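For reference, the completed call can be sketched as a small helper; the function name and path are illustrative, and the SparkSession is passed in rather than assumed:

```python
def read_parquet_df(spark, path="data/sample.parquet"):
    # spark.read returns a DataFrameReader; its parquet() method
    # (the answer to blank [1]) loads the file into a DataFrame.
    return spark.read.parquet(path)
```

Calling `read_parquet_df(spark)` with an active SparkSession returns the loaded DataFrame.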
Complete the code to write a DataFrame to Parquet format with overwrite mode.
df.write.mode("[1]").parquet("output/path")
The overwrite mode replaces existing data at the output path.
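A minimal sketch of the completed write, again wrapped in a hypothetical helper (the output path is illustrative):

```python
def write_parquet_overwrite(df, path="output/path"):
    # mode("overwrite") — the answer to blank [1] — replaces any
    # existing data at `path`; parquet() performs the write.
    df.write.mode("overwrite").parquet(path)
```

Other accepted modes include "append", "ignore", and "error" (the default, which fails if the path exists).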
Complete the code to select only the 'name' and 'age' columns from a Parquet-backed DataFrame.
selected_df = df.select([1])
The select method accepts multiple column names as separate string arguments.
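The filled-in answer can be sketched as follows; the wrapper function is hypothetical:

```python
def select_name_age(df):
    # select() takes each column name as its own string argument,
    # so the blank is filled with: "name", "age"
    return df.select("name", "age")
```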
Fill both blanks to create a dictionary comprehension that maps each column name to its data type from a DataFrame schema.
col_types = {col.name: col.[1] for col in df.schema.[2]}
The schema's fields list contains StructField objects, each with name and dataType attributes.
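With both blanks filled (`dataType` and `fields`), the comprehension looks like this; the helper name is illustrative:

```python
def column_types(df):
    # df.schema.fields is a list of StructField objects; each exposes
    # .name and .dataType, so the comprehension maps name -> type.
    return {f.name: f.dataType for f in df.schema.fields}
```

For a DataFrame with string and long columns this yields something like `{"name": StringType(), "age": LongType()}`.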
Fill all three blanks to filter rows where the 'age' column is greater than 30 and select 'name' and 'age' columns.
filtered_df = df.filter(df.[1] [2] [3]).select("name", "age")
Use the column name age, the operator >, and the value 30 to filter rows.
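Putting all three blanks together (`age`, `>`, `30`), the completed expression can be sketched as a hypothetical helper:

```python
def adults_name_age(df):
    # df.age builds a Column reference; comparing it with > 30 yields a
    # boolean Column expression that filter() applies row by row before
    # select() projects the two columns.
    return df.filter(df.age > 30).select("name", "age")
```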