# Why Data Sharing Eliminates Data Copies in Snowflake: Performance Analysis
The goal is to understand how the amount of work grows with data size when data is shared instead of copied. In other words: how many operations does sharing require as the underlying data grows, and what is the resulting time complexity?
```sql
-- Provider: create a share
CREATE SHARE my_share;

-- Provider: grant access to the database and its objects to the share
-- (my_table is an example table name)
GRANT USAGE ON DATABASE my_database TO SHARE my_share;
GRANT USAGE ON SCHEMA my_database.public TO SHARE my_share;
GRANT SELECT ON TABLE my_database.public.my_table TO SHARE my_share;

-- Provider: make the share visible to the consumer account
ALTER SHARE my_share ADD ACCOUNTS = consumer_account;

-- Consumer: create a database from the share
CREATE DATABASE shared_db FROM SHARE provider_account.my_share;
```
This sequence exposes a database to the consumer account without copying any data: the consumer's `shared_db` is a read-only window onto the provider's storage.
Consider which operations repeat as the data grows:
- Primary operation: granting access to data via share metadata.
- Frequency: once per share setup, regardless of data size.

Because sharing never copies data, the number of operations stays constant as the data grows.
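The constant-work claim can be illustrated with a small sketch (plain Python, not Snowflake's API; the function name and operation breakdown are invented for illustration):

```python
def share_database(data_size_rows: int) -> int:
    """Count the operations needed to share a database of a given size.

    Only metadata operations occur; none of them touch the data itself,
    so the count does not depend on data_size_rows.
    """
    ops = 0
    ops += 1  # provider: CREATE SHARE
    ops += 1  # provider: grant database access to the share
    ops += 1  # consumer: CREATE DATABASE ... FROM SHARE
    return ops

# The operation count is identical for any data size.
for n in (10, 100, 1000):
    print(n, share_database(n))
```

The parameter `data_size_rows` is deliberately unused: that is the whole point of O(1) sharing.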
| Data Size (n rows) | Setup Operations |
|---|---|
| 10 | constant: create share, grant access, create consumer DB |
| 100 | same constant setup, no extra copies |
| 1000 | same constant setup, no data duplication |
Pattern observation: the operation count is constant; it does not grow with data size.
Time Complexity: O(1)
This means the work to share data stays the same no matter how big the data is.
[X] Wrong: "Sharing data copies all the data behind the scenes."
[OK] Correct: Sharing grants access through metadata pointers, so no data is duplicated or copied.
Understanding how sharing avoids copying helps you explain efficient data access in cloud systems.
"What if data sharing required copying data to the consumer account? How would the time complexity change?"
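One way to reason about this question: if sharing required copying, the work would scale with the data. A sketch of that hypothetical (plain Python; the per-batch transfer model and its parameters are assumptions, not Snowflake behavior):

```python
def copy_database(data_size_rows: int, rows_per_batch: int = 100) -> int:
    """Hypothetical copy-based sharing: operations grow with data size."""
    setup_ops = 2  # assumed: create target database, start a copy job
    # One transfer operation per batch of rows: this term is O(n).
    transfer_ops = -(-data_size_rows // rows_per_batch)  # ceiling division
    return setup_ops + transfer_ops

# Unlike metadata-only sharing, the count now grows with n.
for n in (10, 100, 1000):
    print(n, copy_database(n))
```

Under this model the operation count grows linearly with the number of rows, so the time complexity would change from O(1) to O(n).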