Complete the code to cache the DataFrame to optimize performance.
df = spark.read.csv('data.csv')
df.[1]()
Caching the DataFrame stores it in memory (or memory and disk) after the first action computes it, so subsequent actions reuse the cached data instead of recomputing it from the source. This speeds up repeated operations and reduces the load from repeated expensive computations.
Complete the code to repartition the DataFrame for better parallelism.
df = df.[1](10)
Repartitioning sets the number of partitions (here, 10), which can improve parallel processing and help mitigate data skew or resource bottlenecks. Note that repartition() can either increase or decrease the partition count, and it triggers a full shuffle of the data.
Fix the error in the code to avoid job failure by using the correct action.
result = df.filter(df.age > 30)
result.[1]()
'show()' is an action that triggers computation and displays results. Using 'cache()' or 'persist()' alone does not trigger execution, and 'map' is a transformation, not an action.
Fill both blanks to create a dictionary of word lengths for words longer than 3 characters.
lengths = {word: [1] for word in words if len(word) [2] 3}
The dictionary comprehension maps each word to its length using 'len(word)'. The condition filters words with length greater than 3 using '>'.
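With the blanks filled in, the completed comprehension looks like this (the word list is sample data, assumed for illustration):

```python
words = ["spark", "is", "fast", "an", "engine"]  # assumed sample input

# Map each word longer than 3 characters to its length.
lengths = {word: len(word) for word in words if len(word) > 3}
print(lengths)  # {'spark': 5, 'fast': 4, 'engine': 6}
```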
Fill all three blanks to build a filtered dictionary that maps each word longer than 4 characters to its uppercase form.
result = [1](([2], [3]) for [2] in words if len([2]) > 4)
This code builds a dictionary with 'dict' from a generator of (key, value) pairs: the keys are the original words ('word') and the values are the uppercase words ('word.upper()'), keeping only words longer than 4 characters. Note that 'key: value' syntax is only valid inside a brace comprehension; 'dict(word: word.upper() ...)' would be a syntax error, so 'dict()' must be given pairs instead.
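A runnable sketch with an assumed word list, showing the dict-from-pairs form alongside the equivalent brace comprehension:

```python
words = ["spark", "is", "a", "fast", "engine"]  # assumed sample input

# dict() consumes (key, value) pairs from the generator.
result = dict((word, word.upper()) for word in words if len(word) > 4)
print(result)  # {'spark': 'SPARK', 'engine': 'ENGINE'}

# Equivalent brace-comprehension form, where key: value syntax is valid.
same = {word: word.upper() for word in words if len(word) > 4}
assert same == result
```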