Apache Spark - data - ~10 mins

Spark architecture (driver, executors, cluster manager) in Apache Spark - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank
easy

Complete the code to create a SparkSession named spark.

Apache Spark
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName([1]).getOrCreate()
A. "MyApp"
B. SparkContext
C. Executor
D. ClusterManager
Common Mistakes
Using SparkContext instead of a string for appName
Not using quotes around the app name
2. Fill in the blank
medium

Complete the code to get the SparkContext from the SparkSession.

Apache Spark
sc = spark.[1]
A. context
B. sparkContext
C. SparkContext
D. getContext
Common Mistakes
Using 'SparkContext' with uppercase S and C, which is the class name rather than the attribute
Using 'context', which does not exist
3. Fill in the blank
hard

Complete the code so that the job doubles each element of the RDD; collect() is the action that triggers the job.

Apache Spark
rdd = sc.parallelize([1, 2, 3, 4])
result = rdd.[1](lambda x: x * 2).collect()
A. flatMap
B. filter
C. reduce
D. map
Common Mistakes
Using filter, which selects elements instead of transforming them
Using reduce, which aggregates the elements into a single value
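
The difference between the four options can be sketched without a cluster. The snippet below uses Python's built-in equivalents as an analogy only; it is not Spark code. In Spark, map, filter and flatMap are lazy transformations that run on the executors, while reduce is an action that returns a single value to the driver.

```python
from functools import reduce

data = [1, 2, 3, 4]

# map: exactly one output element per input element (a transformation).
doubled = list(map(lambda x: x * 2, data))

# filter: keeps the elements for which the function is truthy.
# x * 2 is non-zero for every element here, so the values pass through unchanged.
kept = list(filter(lambda x: x * 2, data))

# flatMap analogue: each input may produce several outputs, flattened into one list.
flattened = [y for x in data for y in (x, x * 2)]

# reduce: folds the whole collection into one value with a two-argument function.
total = reduce(lambda a, b: a + b, data)

print(doubled)    # [2, 4, 6, 8]
print(kept)       # [1, 2, 3, 4]
print(flattened)  # [1, 2, 2, 4, 3, 6, 4, 8]
print(total)      # 10
```

Only map produces the doubled list, which is why it fits a one-argument lambda such as `lambda x: x * 2`.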
4. Fill in the blank
hard

Fill both blanks to create a dictionary of executor IDs and their memory usage.

Apache Spark
executor_info = {executor.[1]: executor.[2] for executor in sc._jsc.sc().getExecutorMemoryStatus().keySet()}
A. id
B. memoryUsed
C. host
D. memoryTotal
Common Mistakes
Using 'id', which is not a valid attribute here
Using 'memoryUsed' instead of 'memoryTotal'
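
In Spark's Scala API, getExecutorMemoryStatus is documented as returning a map from a "host:port" string to a pair of (maximum memory, remaining memory), so the attribute access in the exercise is best read as a simplification. A minimal sketch of the same dictionary-building idea, assuming that shape and using a plain dict with made-up hosts (`memory_status` and `host_to_max_memory` are hypothetical names, not Spark APIs):

```python
# Hypothetical sample data standing in for the executor memory status map.
# Keys are "host:port" strings; values are (max_memory, remaining_memory) in bytes.
memory_status = {
    "worker-1:35001": (8 * 1024**3, 6 * 1024**3),
    "worker-2:35002": (2 * 1024**3, 1 * 1024**3),
}


def host_to_max_memory(status):
    """Map each host name to its maximum memory, dropping the port."""
    return {
        key.rsplit(":", 1)[0]: max_mem
        for key, (max_mem, _remaining) in status.items()
    }


executor_info = host_to_max_memory(memory_status)
print(executor_info)  # {'worker-1': 8589934592, 'worker-2': 2147483648}
```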
5. Fill in the blank
hard

Fill all three blanks to build a list of hostnames for the executors that have more than 4 GB of memory.

Apache Spark
hosts = [executor.[1] for executor, memory in sc._jsc.sc().getExecutorMemoryStatus().items() if memory [2] [3]]
A. host
B. >
C. 4 * 1024 * 1024 * 1024
D. id
Common Mistakes
Using 'id' instead of 'host' for executor identifier
Using '<' instead of '>' for filtering
Not converting 4GB to bytes
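
The byte conversion and the strict comparison can be checked in isolation. This is a minimal sketch on made-up (host, memory) pairs rather than values read from a real cluster; `FOUR_GB` and `executors` are hypothetical names chosen for illustration.

```python
FOUR_GB = 4 * 1024 * 1024 * 1024  # 4 GB expressed in bytes

# Hypothetical (host, memory_in_bytes) pairs; not read from a real cluster.
executors = [
    ("worker-1", 8 * 1024**3),
    ("worker-2", 4 * 1024**3),
    ("worker-3", 2 * 1024**3),
]

# Strict '>' means an executor with exactly 4 GB is excluded.
hosts = [host for host, memory in executors if memory > FOUR_GB]
print(hosts)  # ['worker-1']
```

Comparing against the literal 4 instead of the byte count would let every executor through, since all memory figures here are in bytes.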