Consider the following Snowpark code snippet:
df = session.table("EMPLOYEES").filter(col("SALARY") > 50000)
df.count()

What happens when df.count() is called?
Think about when Snowpark sends queries to Snowflake.
Snowpark uses lazy evaluation: filter() only builds the query plan, and nothing is sent to Snowflake at that point. count() is an action, so calling it triggers execution of the accumulated plan and returns the number of rows matching the filter.
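Running actual Snowpark code requires a live Snowflake session, but the lazy-evaluation pattern itself can be sketched in plain Python. The LazyTable class below is a hypothetical stand-in (not a Snowpark class) that records filters and evaluates them only when an action such as count() is called:

```python
# Toy sketch of lazy evaluation (illustrative only, not Snowpark itself).
class LazyTable:
    def __init__(self, rows):
        self._rows = rows          # source data
        self._predicates = []      # recorded, unevaluated filters

    def filter(self, predicate):
        new = LazyTable(self._rows)
        new._predicates = self._predicates + [predicate]
        return new                 # nothing is evaluated yet

    def count(self):
        # The "action": only now are the recorded filters applied.
        rows = self._rows
        for p in self._predicates:
            rows = [r for r in rows if p(r)]
        return len(rows)

employees = LazyTable([{"SALARY": 60000}, {"SALARY": 40000}])
df = employees.filter(lambda r: r["SALARY"] > 50000)  # builds, does not run
print(df.count())  # executes now; prints 1
```

The same shape holds in Snowpark: transformations build a query plan, and an action (count, collect, show) sends it to Snowflake.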
You want to create a DataFrame from a SQL query string in Snowpark. Which code snippet correctly creates the DataFrame?
Check which method accepts raw SQL strings.
The session.sql() method wraps a raw SQL query in a DataFrame (which, like any other DataFrame, executes lazily when an action is called). The table() method expects a table name, not a query string.
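A minimal sketch of the distinction, assuming a Snowpark session has already been created (credentials omitted) and the EMPLOYEES table from the earlier snippet:

```python
# Assumes an existing Snowpark `session`; table and query are illustrative.
query = "SELECT * FROM EMPLOYEES WHERE SALARY > 50000"

def high_earners(session):
    # session.sql() accepts a raw SQL string and returns a DataFrame.
    return session.sql(query)

# By contrast, session.table() takes only a table name:
#   session.table("EMPLOYEES")   # correct
#   session.table(query)         # wrong: a query string is not a table name
```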
You have a large dataset and want to optimize your Snowpark DataFrame transformations to minimize data scanned and improve performance. Which approach is best?
Think about reducing data volume early.
Applying filters early reduces the amount of data processed downstream, improving query performance and reducing costs.
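A hedged sketch of this ordering, assuming an existing Snowpark session; the DEPT column and the aggregation are hypothetical additions for illustration:

```python
# In real code you would add:
#   from snowflake.snowpark.functions import avg, col
# (omitted here so the sketch stays self-contained).

def dept_avg_salary(session):
    # Filter first so every later step scans fewer rows, and prune
    # columns early so less data moves through the plan.
    return (
        session.table("EMPLOYEES")
        .filter(col("SALARY") > 50000)   # cut rows as early as possible
        .select("DEPT", "SALARY")        # keep only the needed columns
        .group_by("DEPT")
        .agg(avg("SALARY"))              # aggregate the reduced set
    )
```

The reverse order (aggregate or join first, filter last) produces the same result but forces Snowflake to process rows that the filter would have discarded.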
You need to process sensitive customer data using Snowpark DataFrames. Which practice ensures data security during processing?
Consider built-in Snowflake security features.
Snowflake masking policies protect sensitive data at query time. Restricting DataFrame access to authorized roles ensures security during processing.
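One way to set this up from Snowpark is to issue the policy DDL through session.sql(). The policy, table, column, and role names below are hypothetical, and the sketch assumes a session whose role is privileged to create and apply masking policies:

```python
# Hedged sketch: create a masking policy and attach it to a sensitive
# column. Queries from Snowpark (or anywhere else) then see masked
# values unless the session's role is authorized.

def protect_email_column(session):
    session.sql("""
        CREATE MASKING POLICY IF NOT EXISTS email_mask
        AS (val STRING) RETURNS STRING ->
        CASE WHEN CURRENT_ROLE() IN ('DATA_ADMIN') THEN val
             ELSE '***MASKED***' END
    """).collect()
    session.sql("""
        ALTER TABLE CUSTOMERS MODIFY COLUMN EMAIL
        SET MASKING POLICY email_mask
    """).collect()
```

Because masking is enforced at query time inside Snowflake, the protection applies regardless of which DataFrame transformations run on top of the column.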
You have multiple Snowpark DataFrame operations running in parallel in your application. What is the best practice to manage Snowpark sessions to optimize resource usage?
Think about session overhead and resource management.
Reusing a single Snowpark session reduces connection overhead and optimizes resource usage. Creating many sessions can cause unnecessary load.
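One simple way to enforce reuse is a lazily created, lock-guarded singleton. The sketch below is generic Python; the factory argument stands in for something like `lambda: Session.builder.configs(connection_params).create()`, which is an assumption about how the caller would construct the real Snowpark session:

```python
import threading

_session = None
_lock = threading.Lock()

def get_session(create):
    # `create` is a zero-argument factory for the Snowpark session, e.g.
    #   lambda: Session.builder.configs(connection_params).create()
    global _session
    with _lock:
        if _session is None:
            _session = create()   # opened once, on first use
        return _session           # every later caller reuses it
```

Parallel tasks all call get_session() with the same factory; only the first call pays the connection cost, and the rest share the open session.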