
Creating DataFrames from files (CSV, JSON, Parquet) in Apache Spark - Interactive Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to read a CSV file into a DataFrame.

Apache Spark
df = spark.read.format([1]).load("data/employees.csv")
A. "csv"
B. "text"
C. "parquet"
D. "json"
Common Mistakes
Using "json" or "parquet" format for CSV files.
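Forgetting to specify the format before loading. Without format(), load() falls back to the default data source, which is Parquet unless spark.sql.sources.default is changed.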
2. Fill in the blank (medium)

Complete the code to read a JSON file into a DataFrame.

Apache Spark
df = spark.read.[1]("data/employees.json")
A. csv
B. parquet
C. json
D. text
Common Mistakes
Using read.csv() or read.parquet() for JSON files.
Using read.text(), which returns each raw line as a string without parsing it as JSON.
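The difference between the JSON reader and the text reader can be sketched as below. This is a minimal illustration, not part of the exercise: the helper names `read_as_json` and `read_as_text` are hypothetical, and `spark` is assumed to be an existing SparkSession passed in by the caller.

```python
def read_as_json(spark, path="data/employees.json"):
    # Parses each line as a JSON record; column names come from the JSON keys.
    return spark.read.json(path)


def read_as_text(spark, path="data/employees.json"):
    # No parsing at all: one row per line, in a single string column named "value".
    return spark.read.text(path)
```

With a live session (for example `spark = SparkSession.builder.getOrCreate()`), comparing `printSchema()` on the two results is a quick way to see which reader was used.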
3. Fill in the blank (hard)

Complete the code to read a Parquet file into a DataFrame.

Apache Spark
df = spark.read.format("parquet").[1]("data/employees.parquet")
A. csv
B. read
C. json
D. load
Common Mistakes
Using read() instead of load() after format(). The reader returned by spark.read has no read() method; load() is the call that takes the path.
Using json() or csv() methods for Parquet files.
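The first three tasks use two spellings of the same operation: the generic `format(...).load(...)` chain and the per-format shortcut methods. A minimal sketch for Parquet follows, assuming `spark` is an existing SparkSession; the function names are hypothetical and exist only for illustration.

```python
def read_parquet_generic(spark, path="data/employees.parquet"):
    # format() only names the data source; load() is the call that takes the path.
    return spark.read.format("parquet").load(path)


def read_parquet_shortcut(spark, path="data/employees.parquet"):
    # Shortcut method for the same source; it should yield the same DataFrame.
    return spark.read.parquet(path)
```

The same pairing applies to `spark.read.csv(path)` and `spark.read.json(path)`. The generic form is handy when the format name arrives as a variable, and the shortcut reads better when the format is fixed.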
4. Fill in the blank (hard)

Fill both blanks to read a CSV file with header and infer schema options.

Apache Spark
df = spark.read.format("csv").option([1], "true").option([2], "true").load("data/employees.csv")
A. "header"
B. "inferSchema"
C. "delimiter"
D. "path"
Common Mistakes
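Choosing "delimiter" (which sets the field separator) or "path" (which names the file location) instead of the header and schema options.
Not setting header or inferSchema to "true". Without header, columns get default names such as _c0 and _c1; without inferSchema, every column is read as a string.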
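The two options from this task can be written either as chained `option()` calls or as keyword arguments to the `csv()` shortcut. The sketch below shows both, assuming `spark` is an existing SparkSession; the function names are hypothetical.

```python
def read_employees_csv(spark, path="data/employees.csv"):
    return (
        spark.read.format("csv")
        .option("header", "true")       # first line supplies the column names
        .option("inferSchema", "true")  # extra pass over the data to guess column types
        .load(path)
    )


def read_employees_csv_short(spark, path="data/employees.csv"):
    # Same two settings passed as keyword arguments to the csv() shortcut.
    return spark.read.csv(path, header=True, inferSchema=True)
```

Because schema inference needs an additional pass over the file, supplying an explicit schema is often preferred for large inputs; for small practice files like this one, inference is usually fine.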
5. Fill in the blank (hard)

Fill both blanks to create a DataFrame from JSON with multiLine option enabled.

Apache Spark
df = spark.read.format([1]).option([2], "true").load("data/employees_multiline.json")
A. "json"
B. "multiLine"
C. "csv"
D. "parquet"
Common Mistakes
Using csv or parquet format for JSON files.
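Not setting the multiLine option when a single JSON record or array spans several lines. By default Spark expects one complete JSON object per line.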
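To see when multiLine is needed, it helps to compare the two file layouts. The sketch below uses only the standard `json` module to check whether every line of a text is a complete JSON value, which is a rough stand-in for the default line-by-line expectation and not Spark's actual parser. The sample records, `each_line_is_a_record`, and `read_employees_json` are all hypothetical, and `spark` is assumed to be an existing SparkSession.

```python
import json

# Made-up sample data in two layouts: one record per line vs. a pretty-printed array.
RECORDS = [{"name": "Ana", "dept": "ops"}, {"name": "Raj", "dept": "dev"}]
JSON_LINES = "\n".join(json.dumps(r) for r in RECORDS) + "\n"
PRETTY = json.dumps(RECORDS, indent=2)


def each_line_is_a_record(text):
    """Return True if every non-empty line parses as JSON on its own."""
    for line in text.splitlines():
        if not line.strip():
            continue
        try:
            json.loads(line)
        except json.JSONDecodeError:
            return False
    return True


def read_employees_json(spark, path, multiline):
    reader = spark.read.format("json")
    if multiline:
        # Needed when one record (or the enclosing array) spans several lines.
        reader = reader.option("multiLine", "true")
    return reader.load(path)
```

A file shaped like `JSON_LINES` passes the line check and can be read with the defaults, while one shaped like `PRETTY` fails it and is the case where `multiline=True` would be passed.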