Challenge - 5 Problems
Type Casting and Null Handling Mastery
❓ Predict Output
Difficulty: intermediate
Output of type casting with null values in Spark DataFrame
What is the output of the following Spark code snippet?
Apache Spark
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

data = [("1", "100"), ("2", None), (None, "300")]
df = spark.createDataFrame(data, ["id", "value"])

df_cast = df.select(
    col("id").cast("int").alias("id_int"),
    col("value").cast("int").alias("value_int"),
)
df_cast.show()
💡 Hint
Casting a null string to int yields a null integer in the resulting DataFrame.
Explanation
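When a string column is cast to integer in Spark, null strings become null integers and strings that represent whole numbers convert to their integer values. The output therefore contains the rows (1, 100), (2, null), and (null, 300), with both columns typed as int.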
❓ Data Output
Difficulty: intermediate
Count of nulls after type casting in Spark DataFrame
Given the DataFrame below, what is the count of null values in the 'age_int' column after casting 'age' from string to integer?
Apache Spark
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

# One null, one empty string, and three numeric strings
data = [("25",), ("30",), (None,), ("",), ("40",)]
df = spark.createDataFrame(data, ["age"])

df_cast = df.select(col("age").cast("int").alias("age_int"))

# Count the rows whose cast result is null
null_count = df_cast.filter(col("age_int").isNull()).count()
print(null_count)
💡 Hint
Both None and the empty string cast to a null integer when ANSI mode is off.
Explanation
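Both the None value and the empty string become null after casting to integer, so the count is 2. This assumes ANSI mode (spark.sql.ansi.enabled) is off; with ANSI mode on, casting the empty string raises an error instead of producing null.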
🔧 Debug
Difficulty: advanced
Identify the error in type casting with null handling
What error will this Spark code raise?
Apache Spark
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

data = [("100",), ("abc",), (None,)]
df = spark.createDataFrame(data, ["score"])

df_cast = df.select(col("score").cast("int").alias("score_int"))
df_cast.show()
💡 Hint
With ANSI mode off, invalid strings become null after casting; no error is raised.
Explanation
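With ANSI mode off (the default before Spark 4.0), casting an invalid string such as "abc" to int silently yields null, so this code raises no error and shows 100, null, and null. With spark.sql.ansi.enabled set to true, the same cast fails when the query runs.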
🚀 Application
Difficulty: advanced
Handling nulls after type casting for aggregation
You have a Spark DataFrame with a string column 'price' containing numeric strings and nulls. You want to compute the average price as a float. Which code snippet correctly handles nulls after casting?
💡 Hint
Aggregate functions such as avg, sum, min, and max skip null inputs automatically.
Explanation
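Casting to float and calling avg directly works because avg skips nulls: it divides the sum of the non-null values by the number of non-null values. Dropping null rows first is unnecessary, and filling nulls with 0 distorts the result by pulling the average down.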
🧠 Conceptual
Difficulty: expert
Effect of type casting on null handling in Spark SQL expressions
In Spark SQL, what happens when you cast a string column containing empty strings and nulls to integer? Choose the most accurate statement.
💡 Hint
With ANSI mode off, casting an empty string to a numeric type yields null, just as casting a null does.
Explanation
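In Spark SQL with ANSI mode off, casting an empty string or a null to integer produces a null value and raises no error. With ANSI mode on, a null still casts to null, but an empty string causes a cast error.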