
Date and timestamp functions in Apache Spark - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to extract the year from the 'date' column.

Apache Spark
from pyspark.sql.functions import [1]
df.select([1]("date")).show()
Options:
A. year
B. hour
C. dayofmonth
D. month
Common Mistakes
Using month or dayofmonth instead of year.
Forgetting to import the function.
2. Fill in the blank (medium)

Complete the code to add 5 days to the 'date' column.

Apache Spark
from pyspark.sql.functions import date_add
df.select(date_add("date", [1])).show()
Options:
A. 5
B. 3
C. 10
D. 1
Common Mistakes
Using a wrong number of days.
Passing the number as a string instead of an integer.
3. Fill in the blank (hard)

Complete the code so that it calculates the difference in days between 'date1' and 'date2'.

Apache Spark
from pyspark.sql.functions import datediff
df.select(datediff("date1", [1])).show()
Options:
A. "date1"
B. date1
C. "date2"
D. date2
Common Mistakes
Passing a bare Python name (date2) instead of the quoted column name ("date2").
Using the same column name twice.
4. Fill in the blank (hard)

Fill both blanks to create a new column 'hour' extracting the hour from 'timestamp' and filter rows where hour is greater than 12.

Apache Spark
from pyspark.sql.functions import [1]
df.select([1]("timestamp").alias("hour")).filter("hour [2] 12").show()
Options:
A. hour
B. >
C. <
D. minute
Common Mistakes
Using minute instead of hour.
Using less than instead of greater than in filter.
5. Fill in the blank (hard)

Fill all three blanks to format the 'date' column as 'yyyy-MM' into a new column 'month_year' and filter rows where month_year > '2023-06'.

Apache Spark
from pyspark.sql.functions import [1]
df.withColumn("month_year", [1]([2], [3])).filter("month_year > '2023-06'").show()
Options:
A. date_format
B. "date"
C. "yyyy-MM"
D. year
Common Mistakes
Using an extraction function like year instead of date_format.
Forgetting quotes around the column name or format string.
Using an incorrect format pattern.