
Type casting and null handling in Apache Spark - Practice Problems & Coding Challenges

Challenge - 5 Problems
Problem 1: Predict Output (intermediate)
Output of type casting with null values in Spark DataFrame
What is the output of the following Spark code snippet?
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()
data = [("1", "100"), ("2", None), (None, "300")]
df = spark.createDataFrame(data, ["id", "value"])
df_cast = df.select(col("id").cast("int").alias("id_int"), col("value").cast("int").alias("value_int"))
df_cast.show()
A)
+------+---------+
|id_int|value_int|
+------+---------+
|     1|      100|
|     2|     null|
|  null|      300|
+------+---------+
B)
+------+---------+
|id_int|value_int|
+------+---------+
|     1|      100|
|     2|        0|
|     0|      300|
+------+---------+
C) RuntimeError: Cannot cast null values to int
D)
+------+---------+
|id_int|value_int|
+------+---------+
|     1|     null|
|     2|     null|
|  null|     null|
+------+---------+
💡 Hint
Casting null strings to int results in null values in Spark DataFrame.
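The hint can be checked in miniature without a Spark cluster. Below is a small pure-Python sketch of Spark's default (non-ANSI) cast-to-int semantics; `spark_cast_int` is a hypothetical helper for illustration, not a Spark API.

```python
def spark_cast_int(value):
    """Mimic col(...).cast("int") under Spark's default settings:
    null in -> null out; unparsable string in -> null out, no exception."""
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None  # Spark silently yields null rather than raising

rows = [("1", "100"), ("2", None), (None, "300")]
cast_rows = [(spark_cast_int(a), spark_cast_int(b)) for a, b in rows]
print(cast_rows)  # [(1, 100), (2, None), (None, 300)]
```

Each null simply passes through the cast unchanged, which is why the non-null values in the same row are unaffected.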
Problem 2: Data Output (intermediate)
Count of nulls after type casting in Spark DataFrame
Given the DataFrame below, what is the count of null values in the 'age_int' column after casting 'age' from string to integer?
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, isnan, when, count

spark = SparkSession.builder.getOrCreate()
data = [("25",), ("30",), (None,), ("",), ("40",)]
df = spark.createDataFrame(data, ["age"])
df_cast = df.select(col("age").cast("int").alias("age_int"))
null_count = df_cast.filter(col("age_int").isNull()).count()
print(null_count)
A) 1
B) 2
C) 3
D) 0
💡 Hint
Both the empty string and None become null when cast to integer.
Problem 3: 🔧 Debug (advanced)
Identify the error in type casting with null handling
What error will this Spark code raise?
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()
data = [("100",), ("abc",), (None,)]
df = spark.createDataFrame(data, ["score"])
df_cast = df.select(col("score").cast("int").alias("score_int"))
df_cast.show()
A) ValueError: invalid literal for int() with base 10: 'abc'
B) RuntimeError: Null values not allowed in cast
C) TypeError: Cannot cast string to int
D)
+---------+
|score_int|
+---------+
|      100|
|     null|
|     null|
+---------+
💡 Hint
Invalid strings silently become null after casting in Spark; no error is raised (assuming ANSI mode is disabled, the default in Spark 3.x).
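Option A is tempting because plain Python's int() really does raise on invalid literals; Spark's cast does not (with ANSI mode off). A quick contrast, using a hypothetical `cast_like_spark` helper to stand in for `cast("int")`:

```python
# Plain Python: int() raises on an unparsable literal.
try:
    int("abc")
except ValueError as exc:
    print(type(exc).__name__)  # ValueError

def cast_like_spark(value):
    # Spark's non-ANSI cast: null or invalid input -> null output, no exception
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None

print([cast_like_spark(v) for v in ("100", "abc", None)])  # [100, None, None]
```

The difference in error behavior is the whole point of the question: the cast is permissive by default, not strict.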
Problem 4: 🚀 Application (advanced)
Handling nulls after type casting for aggregation
You have a Spark DataFrame with a string column 'price' containing numeric strings and nulls. You want to compute the average price as a float. Which code snippet correctly handles nulls after casting?
A) df.selectExpr('avg(cast(price as float)) as avg_price').na.drop().show()
B) df.selectExpr('avg(price) as avg_price').show()
C) df.selectExpr('avg(cast(price as float)) as avg_price').show()
D) df.selectExpr('avg(cast(price as int)) as avg_price').na.fill(0).show()
💡 Hint
Aggregation functions in Spark ignore nulls automatically.
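The hint's claim can be checked in miniature: avg() averages over the non-null values only, so casting and then aggregating needs no explicit null-dropping step. A plain-Python simulation with a hypothetical price list:

```python
prices = ["10.5", None, "20.5", None]                    # string column with nulls
non_null = [float(p) for p in prices if p is not None]   # avg() skips the nulls
avg_price = sum(non_null) / len(non_null)
print(avg_price)  # 15.5, the mean of the two present values
```

Note the denominator is the count of non-null values (2), not the row count (4); that is exactly how Spark's avg() behaves.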
Problem 5: 🧠 Conceptual (expert)
Effect of type casting on null handling in Spark SQL expressions
In Spark SQL, what happens when you cast a string column containing empty strings and nulls to integer? Choose the most accurate statement.
A) Empty strings and nulls both become null integers without error.
B) Empty strings cause a runtime error, nulls become zero.
C) Empty strings become zero, nulls cause a runtime error.
D) Empty strings and nulls both become zero integers silently.
💡 Hint
Spark treats empty strings as nulls during cast to numeric types.