Challenge - 5 Problems
Date and Timestamp Mastery
❓ Predict Output
Difficulty: intermediate
Output of date_add function in Spark
What is the output of the following Spark code snippet?
Apache Spark
from pyspark.sql import SparkSession
from pyspark.sql.functions import date_add, to_date

spark = SparkSession.builder.getOrCreate()

data = [("2024-06-15",)]
df = spark.createDataFrame(data, ["date_str"])
df = df.withColumn("date", to_date("date_str"))
df = df.withColumn("new_date", date_add("date", 10))
df.select("new_date").show()
💡 Hint
date_add adds days to a date column.
Explanation
The date_add function adds the specified number of days (10) to the original date (2024-06-15), resulting in 2024-06-25.
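The arithmetic can be cross-checked without a Spark session. The sketch below is a plain-Python analogue using the standard library, not Spark itself; the variable names are illustrative.

```python
from datetime import date, timedelta

# Same starting date as the quiz snippet.
start = date(2024, 6, 15)

# Adding 10 days stays inside June, so only the day-of-month changes.
new_date = start + timedelta(days=10)

print(new_date.isoformat())  # 2024-06-25
```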
❓ Data Output
Difficulty: intermediate
Result of truncating timestamp to month
Given the following Spark DataFrame, what is the output after truncating the timestamp to the month?
Apache Spark
from pyspark.sql import SparkSession
from pyspark.sql.functions import trunc, to_timestamp

spark = SparkSession.builder.getOrCreate()

data = [("2024-06-15 13:45:30",)]
df = spark.createDataFrame(data, ["ts_str"])
df = df.withColumn("ts", to_timestamp("ts_str"))
df = df.withColumn("month_start", trunc("ts", "MM"))
df.select("month_start").show()
💡 Hint
trunc with 'MM' returns the first day of the month.
Explanation
The trunc function with 'MM' truncates the timestamp to the first day of its month, so 2024-06-15 13:45:30 becomes 2024-06-01.
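A plain-Python analogue of month truncation, as a rough sketch: drop the time of day, then reset the day-of-month to 1. This uses only the standard library and illustrates the idea rather than Spark's implementation.

```python
from datetime import datetime

# Same timestamp as the quiz snippet.
ts = datetime(2024, 6, 15, 13, 45, 30)

# Discard the time component, then move to the first day of the month.
month_start = ts.date().replace(day=1)

print(month_start.isoformat())  # 2024-06-01
```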
🔧 Debug
Difficulty: advanced
Identify the error in timestamp conversion
What error will the following Spark code raise when trying to convert a string to timestamp?
Apache Spark
from pyspark.sql import SparkSession
from pyspark.sql.functions import to_timestamp

spark = SparkSession.builder.getOrCreate()

data = [("2024-13-01 10:00:00",)]
df = spark.createDataFrame(data, ["ts_str"])
df = df.withColumn("ts", to_timestamp("ts_str"))
df.show()
💡 Hint
Invalid dates in Spark timestamp conversion result in nulls, not exceptions.
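Explanation
Spark's to_timestamp returns null for invalid date strings (here, month 13) instead of raising an error. This holds when ANSI mode is disabled, which was the default before Spark 4.0; with ANSI mode enabled, invalid input raises an error instead.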
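The null-on-failure behaviour can be mimicked in plain Python. The helper below, `parse_or_none`, is a hypothetical name introduced for illustration; it wraps the standard library parser, which does raise on month 13, and converts that failure into `None`.

```python
from datetime import datetime
from typing import Optional


def parse_or_none(text: str, fmt: str = "%Y-%m-%d %H:%M:%S") -> Optional[datetime]:
    """Parse text into a datetime, returning None when the string is invalid."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        # Invalid calendar values (such as month 13) end up here.
        return None


print(parse_or_none("2024-13-01 10:00:00"))  # None
print(parse_or_none("2024-06-15 13:45:30"))  # 2024-06-15 13:45:30
```

The practical consequence is that bad rows must be found by checking for nulls after the conversion, since no exception points at them.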
🧠 Conceptual
Difficulty: advanced
Understanding unix_timestamp function output
What does the unix_timestamp function return when applied to a timestamp column in Spark?
💡 Hint
unix_timestamp returns seconds, not milliseconds.
Explanation
unix_timestamp returns the number of seconds since the Unix epoch (1970-01-01 00:00:00 UTC) as a long integer.
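To make the seconds-versus-milliseconds distinction concrete, here is a plain-Python sketch that computes epoch seconds for a UTC timestamp. It assumes the timestamp is interpreted as UTC; `to_epoch_seconds` is an illustrative helper, not a Spark function.

```python
from datetime import datetime, timezone


def to_epoch_seconds(ts: datetime) -> int:
    """Whole seconds elapsed since 1970-01-01 00:00:00 UTC."""
    return int(ts.timestamp())


# Assumption: the timestamp is in UTC.
ts = datetime(2024, 6, 15, 13, 45, 30, tzinfo=timezone.utc)

seconds = to_epoch_seconds(ts)
millis = seconds * 1000  # milliseconds need an explicit conversion

print(seconds)  # 1718459130
print(millis)   # 1718459130000
```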
🚀 Application
Difficulty: expert
Calculate age in years from birthdate column
Given a Spark DataFrame with a birthdate column of type date, which code snippet correctly calculates the age in years as an integer?
💡 Hint
Use months_between and floor to get accurate age in years.
Explanation
The correct option uses months_between to get the difference in months between the current date and the birthdate, divides by 12, and floors the result, which yields the number of completed years.
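The floor-of-months approach can be sketched in plain Python. The `months_between` below is a simplified, date-only analogue written for illustration. It assumes the rule in which matching days of the month (or two month-end dates) give a whole number and any other difference is prorated over a 31-day month. `age_in_years` and `is_last_day_of_month` are hypothetical helper names.

```python
import calendar
import math
from datetime import date


def is_last_day_of_month(d: date) -> bool:
    return d.day == calendar.monthrange(d.year, d.month)[1]


def months_between(end: date, start: date) -> float:
    """Simplified date-only analogue: fractional months from start to end."""
    whole = (end.year - start.year) * 12 + (end.month - start.month)
    same_day = end.day == start.day
    both_month_end = is_last_day_of_month(end) and is_last_day_of_month(start)
    if same_day or both_month_end:
        return float(whole)
    # Assumption: leftover days are prorated over a 31-day month.
    return round(whole + (end.day - start.day) / 31.0, 8)


def age_in_years(birthdate: date, as_of: date) -> int:
    """Completed years: floor of the month difference divided by 12."""
    return math.floor(months_between(as_of, birthdate) / 12)


print(age_in_years(date(1990, 6, 15), date(2024, 6, 14)))  # 33 (day before birthday)
print(age_in_years(date(1990, 6, 15), date(2024, 6, 15)))  # 34 (on the birthday)
```

Flooring matters because the day before a birthday gives just under a whole number of years; rounding instead would report the higher age one day early.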