
Schema validation in Apache Spark - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to create a schema with a string field named 'name'.

Apache Spark
from pyspark.sql.types import StructType, StructField, [1]
schema = StructType([StructField('name', [1](), True)])
Options:
A. BooleanType
B. IntegerType
C. StringType
D. FloatType
Common Mistakes
Using IntegerType or other types instead of StringType for text fields.
2. Fill in the blank (medium)

Complete the code to create a DataFrame whose rows are validated against the schema.

Apache Spark
df = spark.createDataFrame(data, schema=[1])
Options:
A. columns
B. data
C. fields
D. schema
Common Mistakes
Passing data or columns instead of schema as the schema argument of createDataFrame.
3. Fill in the blank (hard)

Fix the error in the code to import the correct type for an integer field.

Apache Spark
from pyspark.sql.types import StructType, StructField, [1]
schema = StructType([StructField('age', IntegerType(), True)])
Options:
A. Integer
B. IntegerType
C. Int
D. IntType
Common Mistakes
Using incorrect or non-existent types like IntType or Integer.
4. Fill in the blank (hard)

Fill both blanks to create a schema with a string field 'city' and an integer field 'population'.

Apache Spark
schema = StructType([
    StructField('city', [1](), True),
    StructField('population', [2](), True)
])
Options:
A. StringType
B. FloatType
C. IntegerType
D. BooleanType
Common Mistakes
Using FloatType for population instead of IntegerType.
Using BooleanType for city.
5. Fill in the blank (hard)

Fill all three blanks to create a schema with fields: 'id' (integer), 'name' (string), and 'active' (boolean).

Apache Spark
schema = StructType([
    StructField('id', [1](), True),
    StructField('name', [2](), True),
    StructField('active', [3](), True)
])
Options:
A. StringType
B. BooleanType
C. IntegerType
D. FloatType
Common Mistakes
Mixing up types, such as using FloatType for id or BooleanType for name.