Practice - 5 Tasks
Answer the questions below
Task 1: Fill in the blank (easy)
Complete the code to import the window function module in PySpark.
Apache Spark
from pyspark.sql import [1]
Common Mistakes
Using 'functions' instead of 'Window' for window specifications.
Importing 'window' in lowercase which does not exist.
The correct import for window functions in PySpark is 'Window' from pyspark.sql.
Task 2: Fill in the blank (medium)
Complete the code to create a window specification partitioned by 'department'.
Apache Spark
from pyspark.sql import Window

windowSpec = Window.partitionBy([1])
Common Mistakes
Using a numeric column like 'salary' instead of a categorical one.
Not using quotes around the column name.
Partitioning by 'department' groups rows by department for window calculations.
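PySpark itself needs a running Spark session, but the effect of partitionBy can be sketched in plain Python: grouping rows by their 'department' value so that each window calculation later runs within one group. The sample rows below are hypothetical stand-ins for a Spark DataFrame.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for a Spark DataFrame.
rows = [
    {"department": "eng", "salary": 100},
    {"department": "hr", "salary": 80},
    {"department": "eng", "salary": 120},
]

# Window.partitionBy('department') conceptually splits the rows into
# one group per distinct department value, like this:
partitions = defaultdict(list)
for row in rows:
    partitions[row["department"]].append(row)

print(sorted(partitions))          # ['eng', 'hr']
print(len(partitions["eng"]))      # 2
```

Each window function (row_number, sum, and so on) is then evaluated independently inside each of these groups.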
Task 3: Fill in the blank (hard)
Fix the error in the code to calculate the row number over the window specification.
Apache Spark
from pyspark.sql.functions import row_number

result = df.withColumn('row_num', row_number().over([1]))
Common Mistakes
Using 'Window' class instead of the window specification instance.
Using lowercase 'window' which is undefined.
The window specification variable is named 'windowSpec' and must be passed to 'over()'.
Task 4: Fill in the blank (hard)
Fill both blanks to create a window specification partitioned by 'department' and ordered by 'salary' descending.
Apache Spark
from pyspark.sql import Window
from pyspark.sql.functions import col

windowSpec = Window.partitionBy([1]).orderBy(col([2]).desc())
Common Mistakes
Ordering by a non-numeric column like 'name'.
Not using the correct column names as strings.
Partition by 'department' and order by 'salary' descending to rank salaries within each department.
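As a plain-Python analogue (not PySpark itself; the data below is hypothetical), ranking salaries within each department by descending salary mirrors what row_number() produces over this window specification:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows standing in for a Spark DataFrame.
rows = [
    {"department": "eng", "name": "a", "salary": 100},
    {"department": "eng", "name": "b", "salary": 120},
    {"department": "hr", "name": "c", "salary": 80},
]

# Sort by the partition key, then by salary descending within it,
# mimicking Window.partitionBy('department').orderBy(col('salary').desc()).
rows.sort(key=lambda r: (r["department"], -r["salary"]))

ranked = []
for _, group in groupby(rows, key=itemgetter("department")):
    for i, row in enumerate(group, start=1):
        ranked.append({**row, "row_num": i})

print(ranked[0])  # the highest 'eng' salary gets row_num 1
```

The numbering restarts at 1 for each department, which is exactly what partitioning buys you over a single global ordering.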
Task 5: Fill in the blank (hard)
Fill all three blanks to calculate the cumulative sum of 'sales' partitioned by 'region' and ordered by 'date'.
Apache Spark
from pyspark.sql import Window
from pyspark.sql.functions import sum

windowSpec = Window.partitionBy([1]).orderBy([2])
cum_sum = sum([3]).over(windowSpec)
Common Mistakes
Using 'profit' instead of 'sales' for the sum.
Mixing up partition and order columns.
Partition by 'region', order by 'date', and sum 'sales' cumulatively over the window.
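The cumulative sum this window computes can be sketched in plain Python (hypothetical data, not PySpark): within each region, rows sorted by date accumulate a running total of sales.

```python
from itertools import accumulate, groupby
from operator import itemgetter

# Hypothetical rows standing in for a Spark DataFrame.
rows = [
    {"region": "east", "date": "2024-01-01", "sales": 10},
    {"region": "east", "date": "2024-01-02", "sales": 5},
    {"region": "west", "date": "2024-01-01", "sales": 7},
]

# Sort by partition key then order key, mimicking
# Window.partitionBy('region').orderBy('date').
rows.sort(key=itemgetter("region", "date"))

cum = []
for _, group in groupby(rows, key=itemgetter("region")):
    group = list(group)
    for row, total in zip(group, accumulate(r["sales"] for r in group)):
        cum.append({**row, "cum_sum": total})

print([r["cum_sum"] for r in cum])  # [10, 15, 7]
```

Note how the running total resets when the region changes: east accumulates 10 then 15, while west starts over at 7.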