Apache Spark · ~10 mins

Null and duplicate detection in Apache Spark - Interactive Code Practice

Practice: 5 Tasks
Answer the questions below.
Task 1 · Fill in the blank (easy)

Complete the code to count the number of null values in the 'age' column of the DataFrame.

Apache Spark
null_count = df.filter(df['age'].[1]()).count()
A. isNull
B. isNotNull
C. dropna
D. distinct
Common Mistakes
Using '==' or '!=' with None; under Spark's three-valued logic these comparisons evaluate to NULL, so the filter matches nothing.
Confusing the Column method isNull() with DataFrame-level methods like dropna() or distinct().
Task 2 · Fill in the blank (medium)

Complete the code to drop duplicate rows from the DataFrame.

Apache Spark
df_no_duplicates = df.[1]()
A. drop
B. dropDuplicates
C. distinct
D. dropna
Common Mistakes
Using dropna(), which removes rows containing nulls, not duplicate rows.
Reaching for distinct(), which also removes duplicate rows but always compares all columns; with no arguments dropDuplicates() behaves the same way, but only dropDuplicates() can be restricted to a subset of columns.
Task 3 · Fill in the blank (hard)

Complete the code to filter rows where the 'salary' column is not null.

Apache Spark
filtered_df = df.filter(df['salary'].[1]())
A. dropna
B. isNull
C. distinct
D. isNotNull
Common Mistakes
Using '!=' or '==' with None; under three-valued logic these evaluate to NULL and keep no rows.
Using DataFrame-level methods like dropna() where a Column-level test is needed.
Task 4 · Fill in the blank (hard)

Complete the code to drop duplicate rows based on the 'name' and 'age' columns.

Apache Spark
df_no_duplicates = df.dropDuplicates([[1], [2]])
A. 'name'
B. 'age'
C. 'salary'
D. df['name']
Common Mistakes
Using Column objects like df['name'] instead of strings.
Selecting incorrect columns like 'salary'.
Forgetting quotes around column names.
Task 5 · Fill in the blank (hard)

Fill all three blanks to count the number of duplicate groups based on 'name' and 'age' columns.

Apache Spark
dupe_groups_count = df.groupBy([1], [2]).count().filter(col('[3]') > 1).count()
A. 'name'
B. 'age'
C. count
D. 'salary'
Common Mistakes
Passing Column objects like df['name'] instead of strings to groupBy.
Filtering on the wrong column: the count() aggregation adds a column literally named 'count', so filtering on 'name' or 'age' is incorrect.
Omitting col(), or using the wrong syntax, in the filter expression.