Snowflake - Cloud - ~20 mins

The Snowpark DataFrame API in Snowflake - Practice Problems & Coding Challenges

Challenge - 5 Problems
🎖️ Snowpark DataFrame Master
Get all challenges correct to earn this badge - test your skills under time pressure!
Problem 1 - service_behavior - intermediate - 2:00 limit
Understanding DataFrame Lazy Evaluation in Snowpark

Consider the following Snowpark code snippet:

df = session.table("EMPLOYEES").filter(col("SALARY") > 50000)
df.count()

What happens when df.count() is called?

A. The filter operation runs immediately and returns the count of employees with salary over 50000.
B. No query runs until an action like count() is called; count() triggers the query execution and returns the number of matching rows.
C. The count() method returns the total number of rows in the EMPLOYEES table, ignoring the filter.
D. The filter operation runs immediately, but count() only returns a DataFrame object without executing the query.
💡 Hint: Think about when Snowpark sends queries to Snowflake.
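The lazy-evaluation behavior described in option B can be sketched with a minimal, self-contained Python analogue. This runs without any Snowflake connection; `LazyFrame` and its attributes are hypothetical stand-ins, not Snowpark APIs, but the pattern mirrors how Snowpark records transformations and only sends SQL to Snowflake when an action such as `count()` is called:

```python
# Minimal sketch of lazy evaluation, mimicking Snowpark's behavior.
# LazyFrame is a hypothetical stand-in: real Snowpark DataFrames build a
# SQL plan client-side and execute it in Snowflake only on an action.

class LazyFrame:
    def __init__(self, rows):
        self.rows = rows
        self.predicates = []          # transformations are only recorded
        self.queries_executed = 0     # tracks when work actually happens

    def filter(self, predicate):
        new = LazyFrame(self.rows)
        new.predicates = self.predicates + [predicate]
        new.queries_executed = self.queries_executed
        return new                    # no rows are scanned here

    def count(self):
        # Action: the recorded plan finally runs.
        self.queries_executed += 1
        return sum(1 for r in self.rows
                   if all(p(r) for p in self.predicates))

employees = [{"SALARY": 60000}, {"SALARY": 40000}, {"SALARY": 75000}]
df = LazyFrame(employees).filter(lambda r: r["SALARY"] > 50000)
assert df.queries_executed == 0      # filter() alone executed nothing
assert df.count() == 2               # count() triggers execution
```

The key point for the quiz: `filter()` builds up the plan, and only the action at the end does any work.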

Problem 2 - Configuration - intermediate - 2:00 limit
Correctly Creating a Snowpark DataFrame from a SQL Query

You want to create a DataFrame from a SQL query string in Snowpark. Which code snippet correctly creates the DataFrame?

A. df = session.sql("SELECT * FROM EMPLOYEES WHERE DEPARTMENT = 'SALES'")
B. df = session.table("SELECT * FROM EMPLOYEES WHERE DEPARTMENT = 'SALES'")
C. df = session.execute("SELECT * FROM EMPLOYEES WHERE DEPARTMENT = 'SALES'")
D. df = session.query("SELECT * FROM EMPLOYEES WHERE DEPARTMENT = 'SALES'")
💡 Hint: Check which method accepts raw SQL strings.
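The distinction behind option A: in Snowpark, `session.sql()` accepts a raw SQL query string, while `session.table()` expects a table name. The `StubSession` below is a hypothetical stand-in (no Snowflake connection, and its return values are illustrative dictionaries rather than real DataFrames) that mirrors that API split:

```python
# Hypothetical stub mirroring the Snowpark Session API surface:
# session.sql(query) takes raw SQL; session.table(name) takes a table name.

class StubSession:
    def sql(self, query: str):
        # Real Snowpark: returns a DataFrame backed by this query.
        return {"kind": "sql", "source": query}

    def table(self, name: str):
        # Real Snowpark: 'name' must be a table identifier, not a query.
        if " " in name.strip():
            raise ValueError(f"not a valid table name: {name!r}")
        return {"kind": "table", "source": name}

session = StubSession()

# Option A: correct - sql() accepts a raw query string.
df = session.sql("SELECT * FROM EMPLOYEES WHERE DEPARTMENT = 'SALES'")
assert df["kind"] == "sql"

# Option B: wrong - table() rejects a query string.
try:
    session.table("SELECT * FROM EMPLOYEES WHERE DEPARTMENT = 'SALES'")
    assert False, "should have raised"
except ValueError:
    pass
```

Options C and D fail for a simpler reason: `session.execute` and `session.query` are not Snowpark Session methods at all.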

Problem 3 - Architecture - advanced - 2:00 limit
Optimizing Snowpark DataFrame Operations for Large Datasets

You have a large dataset and want to optimize your Snowpark DataFrame transformations to minimize data scanned and improve performance. Which approach is best?

A. Use multiple collect() calls to fetch intermediate results to speed up processing.
B. Load the entire dataset into memory before applying any filters or transformations.
C. Apply filters early in the DataFrame chain to reduce data processed in later steps.
D. Avoid using filters and rely on Snowflake's automatic optimization only.
💡 Hint: Think about reducing data volume early.
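Option C (filter early) can be illustrated in plain Python: pushing the cheap filter to the front of the chain means every later step touches fewer rows. The row counts and the `expensive_transform` function here are illustrative assumptions; in Snowpark the same idea lets Snowflake prune data before the expensive work runs:

```python
# Illustrative sketch: applying the cheap filter first means the
# expensive transform only touches the surviving rows.

rows = [{"SALARY": s} for s in range(0, 100000, 1000)]  # 100 rows

transform_calls = 0

def expensive_transform(row):
    global transform_calls
    transform_calls += 1
    return {**row, "BONUS": row["SALARY"] * 0.1}

# Filter early: the transform runs only on matching rows.
early = [expensive_transform(r) for r in rows if r["SALARY"] > 50000]
early_calls = transform_calls

# Filter late: the transform runs on every row first.
transform_calls = 0
late = [r for r in map(expensive_transform, rows) if r["SALARY"] > 50000]
late_calls = transform_calls

assert len(early) == len(late) == 49            # same result either way
assert early_calls == 49 and late_calls == 100  # far less work when early
```

In Snowpark the saving is larger still, because an early `.filter()` becomes a WHERE clause that Snowflake can use to prune micro-partitions and scan less data.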

Problem 4 - security - advanced - 2:00 limit
Handling Sensitive Data with Snowpark DataFrames

You need to process sensitive customer data using Snowpark DataFrames. Which practice ensures data security during processing?

A. Use Snowflake's masking policies and restrict DataFrame access to authorized roles only.
B. Export the data to local files, mask it there, then reload into Snowpark DataFrames.
C. Disable all Snowflake security features and rely on Snowpark's client-side encryption.
D. Share the DataFrame with all users to speed up processing and mask data manually.
💡 Hint: Consider built-in Snowflake security features.
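Option A relies on Snowflake masking policies, which are defined server-side and applied based on the querying role, so unmasked data never reaches an unauthorized session. As a plain-Python illustration of the rule such a policy encodes, here is a hypothetical role-based masking function (the role names and the masking format are made-up assumptions, not Snowflake defaults):

```python
# Hypothetical sketch of role-based masking. A real Snowflake masking
# policy applies logic like this inside Snowflake, per querying role,
# so Snowpark DataFrames for unauthorized roles only ever see masked data.

def mask_email(email: str, current_role: str) -> str:
    # Roles below are illustrative assumptions.
    if current_role in {"SECURITY_ADMIN", "DATA_STEWARD"}:
        return email                      # authorized roles see raw data
    local, _, domain = email.partition("@")
    return f"{local[0]}***@{domain}"      # everyone else sees a masked value

assert mask_email("alice@example.com", "DATA_STEWARD") == "alice@example.com"
assert mask_email("alice@example.com", "ANALYST") == "a***@example.com"
```

The advantage over options B and D is that the masking happens before data leaves Snowflake, so there is no window where raw sensitive values sit on a client machine or in a shared DataFrame.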

Problem 5 - Best Practice - expert - 2:00 limit
Ensuring Efficient Resource Usage with Snowpark Sessions

You have multiple Snowpark DataFrame operations running in parallel in your application. What is the best practice to manage Snowpark sessions to optimize resource usage?

A. Avoid using Snowpark sessions and run raw SQL queries instead.
B. Create a new Snowpark session for each DataFrame operation to isolate workloads.
C. Close the Snowpark session immediately after creating each DataFrame to free resources.
D. Create a single shared Snowpark session and reuse it across all operations to reduce overhead.
💡 Hint: Think about session overhead and resource management.
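Option D (one shared session) can be sketched as a simple lazy singleton accessor. `StubSession` and `get_session` are hypothetical names; in real Snowpark you would create the session once with `Session.builder.configs(...).create()` and hand that same object to every part of the application:

```python
# Sketch of session reuse: create one session lazily, then return the
# same instance to every caller. StubSession stands in for a real
# Snowpark Session, whose creation carries real connection overhead.

class StubSession:
    created = 0

    def __init__(self):
        StubSession.created += 1   # counts how many "connections" we pay for

_session = None

def get_session():
    global _session
    if _session is None:
        _session = StubSession()   # pay the setup cost exactly once
    return _session

a = get_session()
b = get_session()
assert a is b                      # every caller shares one session
assert StubSession.created == 1    # only one session was ever created
```

This avoids the per-operation connection cost of option B and the constant reconnection churn of option C; for truly concurrent work, a session pool is the natural extension of the same idea.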