Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)
Complete the code to read a CSV file into a DataFrame.
Apache Spark
df = spark.read.format([1]).load("data/employees.csv")
Common Mistakes
Using "json" or "parquet" format for CSV files.
Forgetting to specify the format before loading.
Explanation: The format method specifies the file type. For CSV files, use "csv".
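As a minimal sketch of how the string passed to format() selects the data source, the helper below maps a file extension to a format name. The names FORMAT_BY_EXTENSION, pick_format and load_by_extension are hypothetical and not part of Spark; spark is assumed to be an existing SparkSession.

```python
import os

# Hypothetical lookup table: file extension -> Spark data source name.
FORMAT_BY_EXTENSION = {".csv": "csv", ".json": "json", ".parquet": "parquet"}


def pick_format(path):
    """Return the format() argument for a path, based on its extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in FORMAT_BY_EXTENSION:
        raise ValueError(f"no known format for extension {ext!r}")
    return FORMAT_BY_EXTENSION[ext]


def load_by_extension(spark, path):
    """Read a file with the source that matches its extension.

    `spark` is assumed to be an existing SparkSession.
    """
    return spark.read.format(pick_format(path)).load(path)
```

Calling load_by_extension(spark, "data/employees.csv") would then behave like the completed task, since pick_format returns "csv" for that path.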
2. Fill in the blank (medium)
Complete the code to read a JSON file into a DataFrame.
Apache Spark
df = spark.read.[1]("data/employees.json")
Common Mistakes
Using read.csv() or read.parquet() for JSON files.
Using read.text() which reads raw text lines.
Explanation: The read.json() method reads JSON files directly into a DataFrame.
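By default, Spark's JSON reader expects one JSON object per line (JSON Lines). The sketch below writes such a file with the standard library and wraps the read.json() shorthand; write_json_lines and read_employees_json are hypothetical helper names, and spark is assumed to be an existing SparkSession.

```python
import json


def write_json_lines(path, records):
    """Write one JSON object per line, the layout read.json() expects by default."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_employees_json(spark, path):
    """Shorthand reader; equivalent to spark.read.format("json").load(path)."""
    return spark.read.json(path)
```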
3. Fill in the blank (hard)
Complete the code to read a Parquet file into a DataFrame.
Apache Spark
df = spark.read.format("parquet").[1]("data/employees.parquet")
Common Mistakes
Using read() instead of load() after format().
Using json() or csv() methods for Parquet files.
Explanation: After format() sets the file type, load() reads the file at the given path and returns the DataFrame.
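The split between format() and load() can be sketched in two steps, next to the equivalent read.parquet() shorthand. The function names are hypothetical, and spark is assumed to be an existing SparkSession.

```python
def read_parquet_two_steps(spark, path):
    # format() only configures the reader; no DataFrame exists yet.
    reader = spark.read.format("parquet")
    # load() is the call that takes the path and returns the DataFrame.
    return reader.load(path)


def read_parquet_shorthand(spark, path):
    # Shorthand that combines both steps for Parquet files.
    return spark.read.parquet(path)
```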
4. Fill in the blanks (hard)
Fill both blanks to read a CSV file with header and infer schema options.
Apache Spark
df = spark.read.format("csv").option([1], "true").option([2], "true").load("data/employees.csv")
Common Mistakes
Using delimiter or path options incorrectly.
Not setting header or inferSchema to "true".
Explanation: To read a CSV file with a header row and inferred column types, set the header and inferSchema options to "true".
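A minimal sketch of the two options applied together. CSV_OPTIONS and read_csv_with_header are hypothetical names, and spark is assumed to be an existing SparkSession.

```python
# Options from the task, kept in one place so they are easy to inspect.
CSV_OPTIONS = {
    "header": "true",       # treat the first line as column names
    "inferSchema": "true",  # scan the data to guess column types
}


def read_csv_with_header(spark, path):
    """Apply each option to the CSV reader, then load the file."""
    reader = spark.read.format("csv")
    for key, value in CSV_OPTIONS.items():
        reader = reader.option(key, value)
    return reader.load(path)
```

Without inferSchema, the CSV source reads every column as a string, so calling printSchema() on the result is a quick way to check that the option took effect.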
5. Fill in the blanks (hard)
Fill both blanks to create a DataFrame from JSON with multiLine option enabled.
Apache Spark
df = spark.read.format([1]).option([2], "true").load("data/employees_multiline.json")
Common Mistakes
Using csv or parquet format for JSON files.
Not setting the multiLine option when needed.
Explanation: To read multiline JSON files, specify the format as "json" and set the multiLine option to "true".
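To illustrate when multiLine is needed, the sketch below writes a pretty-printed JSON array, where no single line is a complete record, and reads it with the option enabled. write_pretty_json and read_multiline_json are hypothetical helper names, and spark is assumed to be an existing SparkSession.

```python
import json


def write_pretty_json(path, records):
    """Write a JSON array spread over several lines (not JSON Lines)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2)


def read_multiline_json(spark, path):
    """Parse each file as a whole instead of line by line."""
    return (
        spark.read.format("json")
        .option("multiLine", "true")
        .load(path)
    )
```

The first line of such a file is just an opening bracket, which is why a line-by-line reader cannot parse it as a record.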