Practice - 5 Tasks

Answer the questions below

1. Fill in the blank

easy

Complete the code to read a CSV file with automatic schema inference.

Apache Spark

df = spark.read.option("header", "true").option("inferSchema", [1]).csv("data.csv")

A. "true"

B. True

C. False

D. "false"

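As background for this question: without schema inference, Spark reads every CSV column as a string. The sketch below is a rough plain-Python illustration of what inferring a schema means (scan each column and pick the narrowest type that fits every value). It is not Spark's implementation; the helper names `infer_column_type` and `infer_schema` and the sample data are hypothetical.

```python
import csv
import io


def infer_column_type(values):
    """Return the narrowest type name that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        # Nothing to inspect, so fall back to the most general type.
        return "string"
    # Try the narrowest type first, then widen.
    for type_name, parser in (("int", int), ("double", float)):
        try:
            for v in non_empty:
                parser(v)
            return type_name
        except ValueError:
            continue
    return "string"


def infer_schema(csv_text):
    """Map each header name to an inferred type name (illustrative only)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    return {col: infer_column_type([row[col] for row in rows]) for col in columns}


# Hypothetical sample data; the first line plays the role of the header row.
sample = "name,age,score\nAlice,30,4.5\nBob,25,3\n"
print(infer_schema(sample))
```

Inference costs an extra pass over the data, which is one reason the later tasks define the schema explicitly instead.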

2. Fill in the blank

medium

Complete the code to define a schema with a string field 'name' and integer field 'age'.

Apache Spark

from pyspark.sql.types import StructType, StructField, [1], IntegerType
schema = StructType([
    StructField("name", StringType(), True),
    StructField("age", IntegerType(), True)
])

A. FloatType

B. StringType

C. BooleanType

D. DateType


3. Fill in the blank

hard

Complete the code to apply a predefined schema when reading a JSON file.

Apache Spark

df = spark.read.schema([1]).json("data.json")

A. schema

B. Schema

C. struct

D. StructType


4. Fill in the blank

hard

Fill both blanks to create a schema with a non-nullable integer field 'id' and a nullable string field 'email'.

Apache Spark

schema = StructType([
    StructField("id", [1](), [2]),
    StructField("email", StringType(), True)
])

A. IntegerType

B. False

C. True

D. StringType


5. Fill in the blank

hard

Fill all three blanks to create a DataFrame with an explicit schema and print that schema.

Apache Spark

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType
spark = SparkSession.builder.appName("TestApp").getOrCreate()
schema = StructType([
    StructField("name", [1](), True),
    StructField("age", [2](), True)
])
data = [("Alice", 30), ("Bob", 25)]
df = spark.createDataFrame(data, schema=[3])
df.printSchema()

A. StringType

B. IntegerType

C. schema

D. StructType
