Apache Spark · data · ~10 mins

Data quality assertions in Apache Spark - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
Task 1: fill in the blank (easy)

Complete the code to check if the DataFrame has any null values.

Apache Spark
df.selectExpr('count(*) as total', 'count([1]) as non_null').show()
A. null
B. column_name
C. *
D. count
Common Mistakes
The blank must name a column that actually exists in the DataFrame; referencing a missing column raises an AnalysisException.
Writing count(null) does not count null entries; count() skips nulls, so it always returns 0.
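The distinction this blank tests (SQL's count(*) counts every row, while count(column) skips nulls) can be sketched in plain Python without a Spark cluster. The email values below are invented for illustration:

```python
# Plain-Python sketch of count(*) vs count(column_name) semantics.
emails = ["a@x.com", None, "b@x.com", None]

total = len(emails)                            # count(*): every row, nulls included
non_null = sum(e is not None for e in emails)  # count(email): nulls skipped
null_count = total - non_null

print(total, non_null, null_count)  # 4 2 2
```

The gap between the two counts is exactly the number of nulls, which is why comparing them works as a data-quality check.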
Task 2: fill in the blank (medium)

Complete the code to assert that the column 'age' has no null values.

Apache Spark
assert df.filter(df.age.[1](None)).count() == 0, 'Null values found in age column'
A. isNotNull
B. isNull
C. isnan
D. isEmpty
Common Mistakes
Using isNotNull() keeps the non-null rows, the opposite of what this assertion needs.
isnan() detects NaN values in floating-point columns, not nulls.
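The assertion pattern (count the offending rows, then require that count to be zero) can be sketched in plain Python, with a list of dicts standing in for df.filter(df.age.isNull()).count(); the rows are invented:

```python
# Rows standing in for a collected DataFrame; values are invented.
rows = [{"age": 25}, {"age": None}, {"age": 30}]

# Equivalent of df.filter(df.age.isNull()).count()
null_rows = sum(r["age"] is None for r in rows)

try:
    assert null_rows == 0, "Null values found in age column"
except AssertionError as err:
    print(err)  # prints: Null values found in age column
```

Because one row has a null age, the assertion fires and the custom message is printed.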
Task 3: fill in the blank (hard)

Fix the error in the code to assert that all values in 'salary' are positive.

Apache Spark
assert df.filter(df.salary [1] 0).count() == 0, 'Negative or zero salary found'
A. <=
B. >=
C. <
D. >
Common Mistakes
Using '>=' (or '>') selects the valid salaries instead of the offending ones, so the assertion tests the wrong set.
Using '<' catches negative salaries but lets zero through.
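The same check, sketched in plain Python with invented salary rows; the predicate mirrors df.filter(df.salary <= 0).count():

```python
# Invented rows: one valid salary, one zero, one negative.
rows = [{"salary": 5000}, {"salary": 0}, {"salary": -100}]

bad = sum(r["salary"] <= 0 for r in rows)  # '<=' catches both zero and negative
print(bad)  # 2 offending rows, so the assertion would fail
```

Note that with '<' instead of '<=' the zero salary would slip through and bad would be 1.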
Task 4: fill in the blank (hard)

Fill both blanks to create a dictionary of counts for each unique value in 'department' where count is greater than 5.

Apache Spark
dept_counts = {row['[1]']: row['[2]'] for row in df.groupBy('department').count().collect() if row['count'] > 5}
A. department
B. count
C. dept
D. value
Common Mistakes
Using keys such as 'dept' or 'value', which don't exist in the grouped rows; the only columns are 'department' and 'count'.
Confusing 'count' with other column names.
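Since collect() returns rows that can be indexed by column name, the comprehension runs unchanged over plain dicts; the departments and counts below are invented:

```python
# Stand-ins for the rows returned by df.groupBy('department').count().collect().
collected = [
    {"department": "eng", "count": 8},
    {"department": "hr", "count": 3},
    {"department": "sales", "count": 12},
]

dept_counts = {row["department"]: row["count"]
               for row in collected if row["count"] > 5}
print(dept_counts)  # {'eng': 8, 'sales': 12}
```

'hr' is dropped by the filter clause because its count is not greater than 5.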
Task 5: fill in the blank (hard)

Fill all three blanks to create a filtered DataFrame with no nulls in 'email' and 'phone' columns and only rows where 'age' is greater than 18.

Apache Spark
filtered_df = df.filter(df.email.[1]() & df.phone.[2]() & (df.age [3] 18))
A. isNotNull
B. >
C. isNull
D. <
Common Mistakes
Using isNull() instead of isNotNull() includes nulls.
Using '<' instead of '>' filters the wrong age range, keeping minors instead of adults.
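The three-way predicate can be sketched in plain Python with invented rows; the condition mirrors df.filter(df.email.isNotNull() & df.phone.isNotNull() & (df.age > 18)):

```python
# Invented rows: each fails (or passes) a different condition.
rows = [
    {"email": "a@x.com", "phone": "555-1", "age": 25},  # passes all three
    {"email": None,      "phone": "555-2", "age": 30},  # null email
    {"email": "c@x.com", "phone": None,    "age": 40},  # null phone
    {"email": "d@x.com", "phone": "555-4", "age": 17},  # under 18
]

filtered = [r for r in rows
            if r["email"] is not None and r["phone"] is not None and r["age"] > 18]
print(len(filtered))  # 1 row survives all three conditions
```

In actual PySpark, remember that the '&' operator requires each comparison to be parenthesized, as (df.age > 18) is in the exercise.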