Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)
Complete the code to create a schema with a string field named 'name'.
Apache Spark
from pyspark.sql.types import StructType, StructField, [1]
schema = StructType([StructField('name', [1](), True)])
Common Mistakes
Using IntegerType or other types instead of StringType for text fields.
Explanation: StringType defines a string field in the schema.
2. Fill in the blank (medium)
Complete the code to validate a DataFrame against the schema.
Apache Spark
df = spark.createDataFrame(data, schema=[1])
Common Mistakes
Passing the data or a list of column names, instead of the schema object, as the schema argument of createDataFrame.
Explanation: The schema parameter applies the schema when the DataFrame is created.
3. Fill in the blank (hard)
Fix the error in the code to import the correct type for an integer field.
Apache Spark
from pyspark.sql.types import StructType, StructField, [1]
schema = StructType([StructField('age', IntegerType(), True)])
Common Mistakes
Using incorrect or non-existent types like IntType or Integer.
Explanation: The correct Spark SQL type for integers is IntegerType.
4. Fill in the blank (hard)
Fill both blanks to create a schema with a string field 'city' and an integer field 'population'.
Apache Spark
schema = StructType([
    StructField('city', [1](), True),
    StructField('population', [2](), True)
])
Common Mistakes
Using FloatType for population instead of IntegerType.
Using BooleanType for city.
Explanation: Use StringType for 'city' and IntegerType for 'population'.
5. Fill in the blank (hard)
Fill all three blanks to create a schema with fields: 'id' (integer), 'name' (string), and 'active' (boolean).
Apache Spark
schema = StructType([
    StructField('id', [1](), True),
    StructField('name', [2](), True),
    StructField('active', [3](), True)
])
Common Mistakes
Mixing up types, such as using FloatType for id or BooleanType for name.
Explanation: Use IntegerType for 'id', StringType for 'name', and BooleanType for 'active'.