Why Snowpark brings code to the data in Snowflake - Visual Breakdown

Process Flow - Why Snowpark brings code to the data
1. User writes code locally
2. Code sent to Snowflake platform
3. Code runs inside Snowflake near data
4. Results sent back to user
5. User gets output quickly and securely
This flow shows how Snowpark moves your code to where the data lives inside Snowflake, instead of moving data to your code.
Execution Sample
Snowflake
session.sql("SELECT * FROM sales WHERE amount > 100").collect()
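The query runs inside Snowflake, filtering rows close to where they are stored; collect() then returns only the matching rows to the client. Here session is assumed to be an already-created Snowpark Session.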
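To make the contrast concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the warehouse. It is not Snowflake or Snowpark, and the table contents are invented for illustration. It compares how many rows leave the engine when the filter runs on the client with how many leave when the filter runs inside the engine.

```python
import sqlite3

# Stand-in for the warehouse: an in-memory SQLite database (NOT Snowflake).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(i, float(i * 10)) for i in range(1, 21)],  # made-up amounts 10.0 .. 200.0
)

# Data-to-code: fetch every row, then filter on the client.
all_rows = conn.execute("SELECT id, amount FROM sales").fetchall()
client_filtered = [row for row in all_rows if row[1] > 100]

# Code-to-data: the engine applies the filter; only matching rows are fetched.
pushed_down = conn.execute(
    "SELECT id, amount FROM sales WHERE amount > 100"
).fetchall()

print(len(all_rows), "rows fetched when filtering on the client")
print(len(pushed_down), "rows fetched when the filter runs in the engine")
```

Both approaches end with the same rows, but the second one only pulls the matches out of the engine. The same idea applies at a much larger scale when the engine is Snowflake and the client is on the other side of a network.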
Process Table
Step | Action | Location | Data Movement | Result
1 | User writes Snowpark code | User machine | No data moved | Code ready to send
2 | Code sent to Snowflake | Network | Code moves, data stays | Code received by Snowflake
3 | Code executes near data | Snowflake server | No data movement needed | Filtered data prepared
4 | Results sent back | Network | Only filtered results move | User receives output
5 | User processes results | User machine | No extra data movement | Final output ready
💡 Execution ends once the results are returned to the user; only the code and the filtered results ever cross the network.
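The Process Table above can also be written down as data and queried. The sketch below is only an illustrative encoding: the Step class and the "moves" labels are hypothetical names introduced here, not part of Snowpark. It lists which steps put anything on the network and checks that no step moves the full dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    number: int
    action: str
    location: str
    moves: str  # what crosses the network: "nothing", "code", or "filtered results"


# One entry per row of the Process Table.
STEPS = [
    Step(1, "User writes Snowpark code", "User machine", "nothing"),
    Step(2, "Code sent to Snowflake", "Network", "code"),
    Step(3, "Code executes near data", "Snowflake server", "nothing"),
    Step(4, "Results sent back", "Network", "filtered results"),
    Step(5, "User processes results", "User machine", "nothing"),
]

# Steps where something actually crosses the network.
network_steps = [s for s in STEPS if s.moves != "nothing"]
for s in network_steps:
    print(f"Step {s.number}: {s.moves} crosses the network")

# In this flow the full dataset never leaves Snowflake.
full_dataset_moved = any(s.moves == "full dataset" for s in STEPS)
print("Full dataset moved:", full_dataset_moved)
```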
Status Tracker
Variable | Start | After Step 2 | After Step 3 | After Step 4 | Final
Code | Local script | Sent to Snowflake | Executing | Executed | Complete
Data | In Snowflake | In Snowflake | Filtered in Snowflake | Filtered results sent | User received results
Key Moments - 3 Insights
Why doesn't Snowpark move all data to the user's machine?
Because moving large datasets is slow and costly. Snowpark runs the code inside Snowflake, where the data lives, so only the small result set moves back, as shown in rows 2 to 4 of the Process Table.
How does running code near data improve speed?
It avoids the delay of transferring large volumes of data over the network. Row 3 of the Process Table shows the code running inside Snowflake and filtering the data before any results are sent.
What moves over the network when using Snowpark?
Only the code and the filtered results move, not the entire dataset. Rows 2 and 4 of the Process Table show this.
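A rough back-of-the-envelope calculation shows why this matters. The figures below (one million rows, 100 bytes per row, 5,000 matching rows) are hypothetical and chosen only to make the arithmetic easy to follow; the sketch also ignores protocol overhead and compression.

```python
QUERY = "SELECT * FROM sales WHERE amount > 100"

# Hypothetical figures, not measurements from any real system.
TOTAL_ROWS = 1_000_000
MATCHING_ROWS = 5_000
ROW_BYTES = 100


def bytes_data_to_code(total_rows: int, row_bytes: int) -> int:
    """Every row travels to the client, which then filters locally."""
    return total_rows * row_bytes


def bytes_code_to_data(query: str, matching_rows: int, row_bytes: int) -> int:
    """The query text travels out; only the matching rows travel back."""
    return len(query.encode("utf-8")) + matching_rows * row_bytes


data_to_code = bytes_data_to_code(TOTAL_ROWS, ROW_BYTES)
code_to_data = bytes_code_to_data(QUERY, MATCHING_ROWS, ROW_BYTES)

print(f"Data to code: {data_to_code:,} bytes")
print(f"Code to data: {code_to_data:,} bytes")
print(f"Roughly {data_to_code / code_to_data:.0f}x less data on the network")
```

The saving depends almost entirely on how selective the filter is: the fewer rows match, the less has to travel back.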
Visual Quiz - 3 Questions
Test your understanding
In the Process Table, at which step does the code start running near the data?
A. Step 1
B. Step 2
C. Step 3
D. Step 4
💡 Hint
Check the 'Location' and 'Action' columns in row 3 of the Process Table.
According to the Status Tracker, what happens to the data after step 3?
A. Data moves to user machine
B. Data is filtered inside Snowflake
C. Data is deleted
D. Data is copied to another server
💡 Hint
Look at the 'Data' row under 'After Step 3' in the Status Tracker.
If Snowpark moved all the data to the user's machine, which Process Table step would change?
A. Step 4
B. Step 3
C. Step 2
D. Step 5
💡 Hint
Consider where data moves over the network in the Process Table, especially step 4.
Concept Snapshot
Snowpark sends your code to Snowflake where data lives.
Code runs inside Snowflake, close to data.
Only filtered results move back to you.
This reduces data transfer and speeds up processing.
The raw data stays inside Snowflake, which helps keep it secure.
Full Transcript
Snowpark works by sending your code to run inside Snowflake, where the data is stored. This means the data does not move to your computer. Instead, your code runs near the data, filtering or processing it there. Then, only the small results are sent back to you. This approach saves time and network resources, making data processing faster and safer. The execution flow starts with you writing code, sending it to Snowflake, running it near the data, and finally receiving the results.