Complete the code to create a SparkSession named spark.
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName([1]).getOrCreate()
The appName method sets the name of your Spark application. It should be a string like "MyApp".
Complete the code to get the SparkContext from the SparkSession.
sc = spark.[1]
The SparkSession object exposes a property called sparkContext that returns the underlying SparkContext.
Complete the code to submit a job to the cluster manager.
rdd = sc.parallelize([1, 2, 3, 4])
result = rdd.[1](lambda x: x * 2).collect()
The map method is a lazy transformation that applies the given function to each element of the RDD; here it doubles each number. Nothing runs until collect, an action, is called — that is what actually submits the job to the cluster manager.
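For intuition, here is the same doubling transformation in plain Python, without Spark. A list comprehension is evaluated eagerly, whereas Spark's map builds a lineage that only runs when an action is triggered:

```python
# Plain-Python analogue of rdd.map(lambda x: x * 2).collect()
data = [1, 2, 3, 4]
result = [x * 2 for x in data]  # Spark would compute this across partitions
print(result)  # → [2, 4, 6, 8]
```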
Fill both blanks to create a dictionary of executor addresses and their memory usage.
executor_info = {executor.[1]: executor.[2] for executor in sc._jsc.sc().getExecutorMemoryStatus().keySet()}
getExecutorMemoryStatus is a Scala-side SparkContext method reached through the internal _jsc gateway; it maps each executor's "host:port" address to a pair of (total memory, remaining memory) in bytes. The first blank names the executor's identifier and the second its total memory. Be aware that keySet() returns a Scala collection that cannot be iterated directly from Python, so treat this comprehension as a sketch of the intent rather than runnable code.
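The target dictionary can be sketched in plain Python. This assumes the memory status has already been converted into an ordinary dict mapping "host:port" to (total, remaining) byte counts — the addresses and sizes below are hypothetical sample values, not real cluster output:

```python
# Hypothetical, already-converted memory status: address -> (total, remaining) bytes
memory_status = {
    "worker-1:43121": (8 * 1024**3, 5 * 1024**3),
    "worker-2:43122": (4 * 1024**3, 1 * 1024**3),
}

# Dictionary of executor addresses and their total memory
executor_info = {addr: total for addr, (total, remaining) in memory_status.items()}
print(executor_info)
```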
Fill all three blanks to filter executors with memory greater than 4GB and create a list of their hostnames.
hosts = [executor.[1] for executor, memory in sc._jsc.sc().getExecutorMemoryStatus().items() if memory [2] [3]]
The first blank accesses the executor's address, the second is the greater-than operator, and the third is 4 GB expressed in bytes: 4 * 1024**3 (note that Python uses ** for exponentiation; ^ is bitwise XOR). As before, getExecutorMemoryStatus returns a Scala-side map through the internal _jsc gateway, so it has no Python .items() method, and each memory value is a (total, remaining) pair — a real implementation would convert the map first and compare against the total.