beginner

What is the main function to read a JSON file in Apache Spark?

You use spark.read.json(path) to load a JSON file into a DataFrame.

beginner

How do you access nested fields in a JSON column in Spark?

Use dot notation, as in df.select("nestedField.subField"), to reach a field inside a struct column.

intermediate

What Spark function helps to flatten nested JSON structures?

Use explode() to produce one output row per element of an array column. Nested objects (structs) are flattened separately, with dot notation or a "struct.*" selection.

beginner

How can you infer the schema automatically when reading JSON in Spark?

By default, spark.read.json() infers the schema automatically by scanning the JSON data. Nested objects become struct columns and JSON arrays become array columns.

intermediate

Why is it useful to understand the schema of nested JSON data before processing?

Knowing the schema helps you select, filter, and transform nested fields correctly without errors.

Which Spark method reads a JSON file into a DataFrame?

A. spark.read.json()

B. spark.load.csv()

C. spark.read.text()

D. spark.load.json()

How do you select the nested field city inside the address struct of a DataFrame?

A. df.select('address_city')

B. df.select('address-city')

C. df.select('city')

D. df.select('address.city')

What does the explode() function do in Spark?

A. Joins two DataFrames

B. Flattens nested arrays into multiple rows

C. Filters rows based on a condition

D. Aggregates data by key

If a JSON file has nested objects, how does Spark handle the schema by default?

A. Infers the nested schema automatically

B. Reads all data as strings

C. Fails to read nested data

D. Requires manual schema definition

Why is schema knowledge important when working with nested JSON?

A. To convert JSON to CSV

B. To speed up file reading

C. To correctly select and transform nested fields

D. To delete nested data

Explain how to read a nested JSON file in Apache Spark and access a nested field.

Describe the purpose and use of the explode() function when working with nested JSON data in Spark.