
Column expressions and functions in Apache Spark - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to select the column 'age' from the DataFrame.

Apache Spark
df.select([1]).show()
A. df.age
B. age
C. col("age")
D. "age"
Common Mistakes
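Writing age without quotes: Python looks up an undefined variable named age and raises a NameError.
Accessing the column as an attribute (df.age) inside select. This form does run in PySpark; the exercise simply expects a different option.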
2. Fill in the blank (medium)

Complete the code to create a new column 'age_plus_10' by adding 10 to the 'age' column.

Apache Spark
from pyspark.sql.functions import col

df = df.withColumn("age_plus_10", [1] + 10)
A. col("age")
B. df.age
C. "age"
D. age
Common Mistakes
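Using the column name as a plain string in arithmetic: "age" + 10 is evaluated by Python, not Spark, and raises a TypeError.
Using df.age directly in the expression. This form does run in PySpark, but the exercise expects col("age"), which the snippet imports.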
3. Fill in the blank (hard)

Fix the error in the code to filter rows where 'score' is greater than 50.

Apache Spark
filtered_df = df.filter([1] > 50)
A. col("score")
B. score
C. "score"
D. df.score
Common Mistakes
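Comparing the column name as a plain string: "score" > 50 is evaluated by Python and raises a TypeError before filter is called.
Using the bare name score without col(): Python treats it as an undefined variable and raises a NameError.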
4. Fill in the blanks (hard)

Complete the code to create a new column 'high_score' by checking if 'score' > 90.

Apache Spark
from pyspark.sql.functions import col, lit

df = df.withColumn("high_score", [1]("score") > [2](90))
A. col
B. lit
C. when
D. expr
Common Mistakes
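Using "score" directly in the comparison: a plain string is not a reference to the score column, so the comparison does not test the column's values.
Leaving out lit() around 90. PySpark does accept a plain Python number here and wraps it as a literal itself, but the exercise asks for the explicit lit(90) form.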
5. Fill in the blanks (hard)

Fill all three blanks to create 'new_age' by adding 10 to 'age' only if 'age' > 18, otherwise keep 'age'.

Apache Spark
from pyspark.sql.functions import col, lit, when

df = df.withColumn("new_age", when([1]("age") [2] 18, [1]("age") + [3](10)).otherwise([1]("age")))
A. col
B. >
C. lit
D. <
Common Mistakes
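Passing the string "age" instead of col("age"): a plain string is not a Column and does not refer to the age column in these expressions.
Using < instead of > in the condition, which adds 10 to ages below 18 rather than above.
Using 10 directly instead of lit(10). PySpark accepts a plain number here too, but the blank calls for the explicit lit form.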