Apache Spark · data · ~5 mins

Column expressions and functions in Apache Spark - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is a column expression in Apache Spark?
A column expression is a way to refer to a column in a DataFrame to perform operations like selecting, filtering, or transforming data.
beginner
How do you add two columns in a Spark DataFrame?
You can add two columns using the '+' operator on column expressions, for example: df.withColumn('sum', df['col1'] + df['col2']).
beginner
What does the function 'lit()' do in Spark column expressions?
The 'lit()' function creates a column of a literal (constant) value, which can be used in expressions with other columns.
intermediate
Explain the use of 'when' and 'otherwise' functions in Spark.
'when' allows conditional expressions on columns, similar to IF statements. 'otherwise' defines the value if the condition is false.
intermediate
How can you chain multiple column functions in Spark?
You can chain functions by calling them one after another on a column expression, since each call returns a new Column, for example: df.select(col('a').cast('int').alias('a_int')) casts column 'a' to integer and renames the result.
Which function creates a constant column in Spark?
A. lit()
B. col()
C. when()
D. select()
How do you refer to a column named 'age' in a Spark DataFrame?
A. lit('age')
B. when('age')
C. df['age']
D. select('age')
What does the 'when' function do in Spark?
A. Selects columns
B. Adds two columns
C. Creates a literal value
D. Creates a conditional column expression
Which operator is used to add two columns in Spark?
A. -
B. +
C. *
D. /
How do you handle the 'else' part of a conditional expression in Spark?
A. using otherwise()
B. using else()
C. using elsewhen()
D. using lit()
Describe how to create a new column in a Spark DataFrame using column expressions and functions.
Think about how you can combine columns and constants to make a new column.
Explain how conditional logic is implemented in Spark column expressions.
Consider how you choose values based on conditions in a DataFrame.