Hadoop data (~10 mins)

Why data lake architecture centralizes data in Hadoop - Test Your Understanding

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to read data from a Hadoop data lake using Spark.

df = spark.read.format([1]).load("/data/lake/path")
A. csv
B. json
C. xml
D. parquet
Common Mistakes
Choosing a format that is not optimized for big data, such as XML.
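The completed answer as a minimal sketch (the helper name and path are illustrative; it assumes an existing SparkSession is passed in). Parquet is the columnar format typically used in Hadoop data lakes, unlike row-based CSV or XML:

```python
# Minimal sketch of the completed read. Assumes a SparkSession `spark`
# already exists; the path is the quiz's example path.
def read_lake(spark, path="/data/lake/path"):
    # format("parquet") selects Spark's columnar Parquet reader,
    # the usual storage format for data-lake tables.
    return spark.read.format("parquet").load(path)
```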
2. Fill in the blank (medium)

Complete the code to filter data for a specific year in the data lake.

filtered_df = df.filter(df.year == [1])
A. 2020
B. "2020"
C. '2020'
D. year
Common Mistakes
Using quotes around the year number, causing a type mismatch.
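Why quoting the year is a mistake can be seen even in plain Python, where an int and a str never compare equal:

```python
# The filter df.year == 2020 compares integers; quoting the literal
# turns it into a string comparison, which never matches an int year.
year = 2020
assert year == 2020          # int vs int: matches
assert not (year == "2020")  # int vs str: never equal in plain Python
```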
3. Fill in the blank (hard)

Complete the code to write data back to the data lake in Parquet format.

filtered_df.write.mode([1]).format("parquet").save("/data/lake/output")
A. append
B. add
C. insert
D. update
Common Mistakes
Using invalid modes such as 'add' or 'update', which raise errors.
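A small sketch of why 'add' fails: Spark's write API accepts only a fixed set of save modes. The mode names below are Spark's documented SaveMode values; the validation helper itself is hypothetical, mirroring what `.mode(...)` enforces:

```python
# Spark's documented save modes; anything else errors at write time.
VALID_SAVE_MODES = {"append", "overwrite", "ignore", "error", "errorifexists"}

def check_save_mode(mode):
    # Hypothetical helper mirroring Spark's validation of .mode(...)
    if mode.lower() not in VALID_SAVE_MODES:
        raise ValueError(f"Unknown save mode: {mode!r}")
    return mode.lower()
```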
4. Fill in the blank (hard)

Fill both blanks to create a dictionary of word lengths for words longer than 3 characters.

lengths = {word: [1] for word in words if len(word) [2] 3}
A. len(word)
B. >
C. <
D. word
Common Mistakes
Using '<' instead of '>', which inverts the filter.
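A worked version of the completed comprehension with a sample word list (the words are illustrative):

```python
# Sample input for the comprehension (illustrative values).
words = ["spark", "data", "big", "io", "lake"]
# Each value is len(word); the condition len(word) > 3 drops short words,
# so "big" (3 chars) and "io" (2 chars) are filtered out.
lengths = {word: len(word) for word in words if len(word) > 3}
```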
5. Fill in the blank (hard)

Fill all three blanks to create a filtered dictionary with uppercase keys and values greater than 0.

result = {[1]: [2] for k, v in data.items() if v [3] 0}
A. k.upper()
B. v
C. >
D. k.lower()
Common Mistakes
Using k.lower() instead of k.upper().
Using '<' instead of '>'.
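A worked version of the completed comprehension with sample data (the dictionary values are illustrative):

```python
# Sample input (illustrative values).
data = {"a": 2, "b": 0, "c": -1, "d": 5}
# k.upper() uppercases each key; v > 0 keeps only positive values,
# so "b" (0) and "c" (-1) are filtered out.
result = {k.upper(): v for k, v in data.items() if v > 0}
```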