Recall & Review
beginner
What is a window function in Apache Spark?
A window function performs a calculation across a set of rows related to the current row without collapsing those rows into a single output row. Typical uses are running totals and rankings within groups.
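To make the "no collapsing" point concrete, here is a minimal plain-Python sketch of the semantics. It is not Spark code; the rows, the column names (dept, name, salary) and the dept_total column are made up for illustration, and the PySpark expressions in the comments are the rough equivalents.

```python
# Hypothetical sample data; column names are assumptions for illustration.
rows = [
    {"dept": "eng", "name": "Ana", "salary": 100},
    {"dept": "eng", "name": "Bo", "salary": 80},
    {"dept": "ops", "name": "Cy", "salary": 70},
]

# Regular aggregation, roughly df.groupBy("dept").sum("salary"):
# the result has one entry per group.
totals = {}
for r in rows:
    totals[r["dept"]] = totals.get(r["dept"], 0) + r["salary"]

# Window-style result, roughly F.sum("salary").over(Window.partitionBy("dept")):
# every input row is kept and gains a column computed from its related rows.
windowed = [{**r, "dept_total": totals[r["dept"]]} for r in rows]

print(len(totals))    # number of groups
print(len(windowed))  # number of input rows, unchanged
```

The aggregation shrinks three rows to two groups, while the window-style result still has three rows, each carrying its department total.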
beginner
What does the partitionBy clause do in a window specification?
It divides the data into groups (partitions) so the window function works independently within each group, like calculating ranks separately for each department in a company.
intermediate
Explain the difference between orderBy and partitionBy in window functions.
partitionBy groups rows into separate sets. orderBy sorts rows within each partition to define the order for calculations like running totals or ranks.
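The two clauses can be seen working together in a running total. This is a plain-Python sketch of the behavior rather than Spark code; the sales data and the column names (region, day, amount, running_total) are assumptions for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data, deliberately out of order.
sales = [
    {"region": "east", "day": 2, "amount": 30},
    {"region": "west", "day": 1, "amount": 5},
    {"region": "east", "day": 1, "amount": 10},
    {"region": "west", "day": 2, "amount": 15},
]

# Roughly F.sum("amount").over(Window.partitionBy("region").orderBy("day")).
# Sorting by (region, day) brings each partition together (partitionBy)
# and orders the rows inside it (orderBy).
ordered = sorted(sales, key=itemgetter("region", "day"))

running = []
for region, part in groupby(ordered, key=itemgetter("region")):
    total = 0  # the running total restarts for every partition
    for row in part:
        total += row["amount"]
        running.append({**row, "running_total": total})

for row in running:
    print(row["region"], row["day"], row["running_total"])
```

Dropping the partition step would give one running total across all regions; dropping the ordering step would leave the accumulation order undefined.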
intermediate
What is the purpose of the rowsBetween method in window functions?
It defines the frame of rows around the current row to include in the calculation, for example, the previous 2 rows and the next 2 rows, helping to calculate moving averages.
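The "previous 2 rows and next 2 rows" frame corresponds to rowsBetween(-2, 2) in PySpark. The sketch below imitates that frame in plain Python for a single, already ordered partition; the moving_average helper and the temps values are hypothetical.

```python
def moving_average(values, before, after):
    """Average each value with up to `before` rows behind and `after` rows ahead."""
    result = []
    for i in range(len(values)):
        lo = max(0, i - before)               # frame is clipped at the partition start
        hi = min(len(values), i + after + 1)  # and at the partition end
        frame = values[lo:hi]
        result.append(sum(frame) / len(frame))
    return result

# One partition, already in order. Roughly:
# F.avg("temp").over(Window.orderBy("day").rowsBetween(-2, 2))
temps = [10, 20, 30, 40, 50]
averages = moving_average(temps, before=2, after=2)
print(averages)
```

Near the edges of a partition the frame simply contains fewer rows, so the first and last averages are taken over three values instead of five.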
beginner
Give an example of a common window function in Spark and its use.
The rank() function assigns a rank to each row within a partition ordered by a column. For example, ranking salespeople by sales amount within each region.
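A plain-Python sketch of the ranking behavior, including what happens on ties: rank() gives tied rows the same rank and then skips ahead, leaving a gap. The rank_within helper, the data, and the column names (region, rep, amount) are made up for illustration; this is not Spark code.

```python
# Hypothetical sample data; Ana and Bo are tied on amount.
sales = [
    {"region": "east", "rep": "Ana", "amount": 300},
    {"region": "east", "rep": "Bo", "amount": 300},
    {"region": "east", "rep": "Cy", "amount": 100},
    {"region": "west", "rep": "Di", "amount": 50},
]

def rank_within(rows, partition_key, order_key):
    """Rank rows per partition, highest order_key first; ties share a rank."""
    partitions = {}
    for r in rows:
        partitions.setdefault(r[partition_key], []).append(r)

    ranked = []
    for part in partitions.values():
        part = sorted(part, key=lambda r: r[order_key], reverse=True)
        for i, r in enumerate(part):
            if i > 0 and r[order_key] == part[i - 1][order_key]:
                rank = ranked[-1]["rank"]  # tie: reuse the previous row's rank
            else:
                rank = i + 1               # position in the partition, so gaps follow ties
            ranked.append({**r, "rank": rank})
    return ranked

# Roughly F.rank().over(Window.partitionBy("region").orderBy(F.desc("amount")))
ranked = rank_within(sales, "region", "amount")
for r in ranked:
    print(r["region"], r["rep"], r["rank"])
```

Ana and Bo share rank 1, Cy gets rank 3, and the ranking restarts at 1 in the west region. PySpark's dense_rank() would give Cy rank 2 instead.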
What does a window function NOT do?
Window functions do calculations across rows but keep all rows in the output; they do not collapse rows into one.
Which clause in a window specification divides data into groups?
partitionBy splits data into groups for independent window calculations.
What does orderBy do inside a window function?
orderBy sorts rows within each partition to define calculation order.
Which method defines the frame of rows around the current row?
rowsBetween sets the range of rows to include in the window frame.
What would rank() do in a window function?
rank() assigns ranks to rows ordered within each partition.
Explain how window functions differ from regular aggregation functions in Spark.
Think about whether rows are combined or kept separate.
Describe how you would use partitionBy and orderBy together in a window function.
Consider grouping first, then sorting inside groups.