Complete the code to create a SparkSession running in local mode.
from pyspark.sql import SparkSession
spark = SparkSession.builder.master([1]).appName("LocalApp").getOrCreate()
Using "local" tells Spark to run locally on one machine.
Complete the code to read a CSV file using Spark in cluster mode.
df = spark.read.format("csv").option("header", "true").load([1])
In cluster mode, data is often read from distributed storage like HDFS, so the path starts with hdfs://.
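A completed version of the read might look like this sketch; the HDFS path below is hypothetical and should be replaced with your cluster's namenode address and file location:

```python
# Hypothetical path: "namenode:9000" and the file location are placeholders.
df = spark.read.format("csv") \
    .option("header", "true") \
    .load("hdfs://namenode:9000/data/input.csv")
```

The `header` option tells Spark to use the first row as column names; without it, columns are auto-named `_c0`, `_c1`, and so on.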
Fix the error in the SparkSession builder to run in cluster mode.
spark = SparkSession.builder.master([1]).appName("ClusterApp").getOrCreate()
To run Spark on a YARN cluster, the master should be set to "yarn".
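The corrected builder would then read as in this sketch. Note that running against YARN assumes the client can find the Hadoop configuration (typically via the HADOOP_CONF_DIR or YARN_CONF_DIR environment variable):

```python
from pyspark.sql import SparkSession

# "yarn" submits the application to the YARN resource manager
# described by the Hadoop configuration on this machine.
spark = SparkSession.builder \
    .master("yarn") \
    .appName("ClusterApp") \
    .getOrCreate()
```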
Fill both blanks to create a SparkContext for local mode with 4 threads.
from pyspark import SparkContext sc = SparkContext(master=[1], appName=[2])
Setting master to "local[4]" runs Spark locally with 4 threads. The app name can be any string, here "LocalApp".
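With both blanks filled in, a completed sketch looks like this (the app name is an arbitrary example string):

```python
from pyspark import SparkContext

# local[4] = run locally with 4 worker threads;
# local[*] would use one thread per available core.
sc = SparkContext(master="local[4]", appName="LocalApp")
```

Only one SparkContext can be active per JVM, so call `sc.stop()` before creating another.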
Fill all three blanks to create a SparkSession for cluster mode using standalone cluster manager.
spark = SparkSession.builder.master([1]).appName([2]).config("spark.submit.deployMode", [3]).getOrCreate()
To connect to a standalone Spark cluster, the master is set to the cluster URL, e.g. "spark://master:7077". Setting the deploy mode to "cluster" launches the driver on a worker node inside the cluster rather than on the client machine. The app name identifies the job in the cluster UI.
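All three blanks filled in give a sketch like the following; the host "master" and port 7077 are the conventional standalone-master defaults and should match your cluster:

```python
from pyspark.sql import SparkSession

# "spark://master:7077" is the standalone cluster manager's URL.
spark = SparkSession.builder \
    .master("spark://master:7077") \
    .appName("ClusterApp") \
    .config("spark.submit.deployMode", "cluster") \
    .getOrCreate()
```

In practice the deploy mode is usually chosen at submission time with `spark-submit --deploy-mode cluster`, since a script that builds its own SparkSession is already running its driver on the client.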