Apache Spark · Data · ~10 mins

Creating RDDs from collections and files in Apache Spark - Interactive Practice

Practice - 5 Tasks
Answer the questions below
Task 1: fill in the blank (easy)

Complete the code to create an RDD from a Python list.

PySpark:
data = [1, 2, 3, 4, 5]
rdd = sc.[1](data)
A. parallelize
B. textFile
C. read
D. collect
Common Mistakes
Using textFile instead of parallelize for a list.
Trying to use collect to create an RDD.
Task 2: fill in the blank (medium)

Complete the code to create an RDD by reading a text file.

PySpark:
rdd = sc.[1]("/path/to/file.txt")
A. parallelize
B. readFile
C. load
D. textFile
Common Mistakes
Using parallelize to read a file path string.
Using readFile which is not a Spark method.
Task 3: fill in the blank (hard)

Fix the error in the code to create an RDD from a list of numbers.

PySpark:
numbers = [10, 20, 30]
rdd = sc.[1](numbers).collect()
A. parallelize
B. textFile
C. read
D. load
Common Mistakes
Using textFile on a list causes errors.
Trying to use read or load which are not valid here.
Task 4: fill in the blank (hard)

Fill both blanks to create an RDD from a list and filter even numbers.

PySpark:
data = [1, 2, 3, 4, 5]
rdd = sc.[1](data).filter(lambda x: x [2] 2 == 0)
A. parallelize
B. %
C. //
D. ==
Common Mistakes
Using textFile instead of parallelize.
Using // instead of % for modulus.
Task 5: fill in the blank (hard)

Fill all three blanks to create an RDD from a file, map each line to uppercase, and collect the results.

PySpark:
rdd = sc.[1]("/path/to/file.txt")
upper_rdd = rdd.[2](lambda line: line.[3]())
result = upper_rdd.collect()
A. textFile
B. map
C. upper
D. parallelize
Common Mistakes
Using parallelize instead of textFile for reading files.
Using map incorrectly or missing it.
Using lower instead of upper for uppercase conversion.