Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)
Complete the code to create a Spark session in Databricks.
Apache Spark
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName([1]).getOrCreate()
Common Mistakes
Forgetting to put the app name in quotes.
Using a variable name without quotes.
The appName method requires a string argument to name the Spark application.
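A minimal sketch of the point above, assuming PySpark is installed. The helper name make_session and the application name "MyApp" are illustrative and do not come from the exercise. In a Databricks notebook a session is normally already available as spark, in which case getOrCreate() returns that existing session.

```python
def make_session(app_name="MyApp"):
    """Return a SparkSession named app_name (hypothetical helper)."""
    # Imported inside the function so this sketch can be loaded
    # without starting Spark.
    from pyspark.sql import SparkSession

    # appName() expects a string. A bare, undefined name such as MyApp
    # (no quotes) would raise NameError before Spark is ever called.
    return SparkSession.builder.appName(app_name).getOrCreate()
```

Calling make_session("MyApp") passes the quoted name through to appName().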
2. Fill in the blank (medium)
Complete the code to read a CSV file into a DataFrame in Databricks.
df = spark.read.format([1]).option("header", "true").load("/mnt/data/sample.csv")
Common Mistakes
Using the wrong format like "json" or "parquet".
Not putting the format name in quotes.
To read a CSV file, the format must be set to "csv".
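The completed read can be sketched as a small helper. This assumes spark is an active SparkSession; the function name read_csv_with_header is hypothetical and only serves the illustration.

```python
def read_csv_with_header(spark, path):
    """Load a CSV file into a DataFrame (hypothetical helper)."""
    return (
        spark.read.format("csv")       # the format name is passed as a string
        .option("header", "true")      # treat the first line as column names
        .load(path)
    )
```

For example, df = read_csv_with_header(spark, "/mnt/data/sample.csv"). The shorthand spark.read.csv(path, header=True) should give the same result.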
3. Fill in the blank (hard)
Fix the error in the code to display the first 5 rows of the DataFrame.
df.[1](5).show()
Common Mistakes
Using head(5), which returns a list of Row objects, not a DataFrame.
Using 'show', which returns None and causes an AttributeError when chaining .show().
The limit method limits the DataFrame to the first n rows and returns a DataFrame, so show() can be called after it.
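The difference between the three candidates can be sketched as follows, assuming df is a Spark DataFrame. The helper name preview is hypothetical.

```python
def preview(df, n=5):
    """Print up to n rows of a Spark DataFrame (hypothetical helper)."""
    # limit(n) returns a new DataFrame, so show() can be chained onto it.
    df.limit(n).show()
    # By contrast:
    #   df.head(n) returns a list of Row objects, which has no show() method.
    #   df.show(n) prints n rows and returns None, so df.show(n).show() fails.
```

Outside of this exercise, df.show(5) on its own is a shorter way to print five rows.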
4. Fill in the blank (hard)
Fill both blanks to create a DataFrame with only rows where the age is greater than 30.
filtered_df = df.filter(df.[1] [2] 30)
Common Mistakes
Using the wrong column name like 'salary'.
Using the less than operator '<' instead of '>'.
To filter rows where age is greater than 30, use df.age > 30 inside filter().
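A minimal sketch of the filter, assuming df is a Spark DataFrame with a numeric age column. The helper name older_than and its min_age parameter are hypothetical.

```python
def older_than(df, min_age=30):
    """Keep only rows whose age exceeds min_age (hypothetical helper)."""
    # df.age > min_age builds a Column expression rather than a Python bool.
    # Spark evaluates it lazily, when an action such as show() or count() runs.
    return df.filter(df.age > min_age)
```

For example, filtered_df = older_than(df) matches the completed line from the task.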
5. Fill in the blank (hard)
Fill all three blanks to create a dictionary comprehension that maps each word to its length if the length is greater than 3.
lengths = { [1]: [2] for [3] in words if len([3]) > 3 }
Common Mistakes
Using inconsistent variable names across the blanks.
Mapping the key to the wrong value.
The comprehension iterates over words with the loop variable 'word' and maps each word to len(word), keeping only words longer than 3 characters.
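A short worked example of the completed comprehension. The contents of words are sample data chosen for illustration and are not part of the exercise.

```python
# Sample input (an assumption for this sketch).
words = ["spark", "is", "fast", "and", "scalable"]

# Key: the word itself. Value: its length. Filter: length strictly above 3.
lengths = {word: len(word) for word in words if len(word) > 3}

print(lengths)  # {'spark': 5, 'fast': 4, 'scalable': 8}
```

"is" and "and" are dropped because their lengths (2 and 3) are not greater than 3.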