Parquet format and columnar storage in Apache Spark - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to read a Parquet file into a Spark DataFrame.

Apache Spark
df = spark.read.[1]("data/sample.parquet")
Options:
A. csv
B. text
C. json
D. parquet
Common Mistakes
Using csv or json method instead of parquet.
Forgetting to specify the file path.
2. Fill in the blank (medium)

Complete the code to write a DataFrame to Parquet format with overwrite mode.

Apache Spark
df.write.mode("[1]").parquet("output/path")
Options:
A. overwrite
B. ignore
C. error
D. append
Common Mistakes
Using append mode when overwrite is needed.
Omitting the mode, so the default (fail if the output exists) raises an error when the output path already exists.
3. Fill in the blank (hard)

Fix the error in the code to select only the 'name' and 'age' columns from a Parquet DataFrame.

Apache Spark
selected_df = df.select([1])
Options:
A. "name", "age"
B. name, age
C. ["name", "age"]
D. ['name', 'age']
Common Mistakes
Passing a list of column names instead of separate string arguments (PySpark's select accepts both forms, but this exercise expects separate arguments).
Using unquoted column names.
4. Fill in the blank (hard)

Fill both blanks to create a dictionary comprehension that maps each column name to its data type from a DataFrame schema.

Apache Spark
col_types = {col.name: col.[1] for col in df.schema.[2]}
Options:
A. dataType
B. fields
C. columns
D. types
Common Mistakes
Using 'columns' or 'types' which are not schema attributes.
Accessing dataType directly on schema instead of on fields.
5. Fill in the blank (hard)

Fill all three blanks to filter rows where the 'age' column is greater than 30 and select 'name' and 'age' columns.

Apache Spark
filtered_df = df.filter(df.[1] [2] [3]).select("name", "age")
Options:
A. age
B. >
C. 30
D. age > 30
Common Mistakes
Putting the entire condition as one string instead of separate parts.
Using wrong operators like '<' or '==' instead of '>'.