Window functions in Snowflake - Time & Space Complexity
When you use window functions in Snowflake, it helps to understand how the amount of work grows as your data grows — in other words, how the number of operations changes as the input size increases.
Analyze the time complexity of the following query.
```sql
SELECT
    user_id,
    order_date,
    amount,
    SUM(amount) OVER (
        PARTITION BY user_id
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM orders;
```
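The per-partition logic of this window function can be sketched in Python. The sample rows below are hypothetical, and Snowflake's actual execution is columnar and parallel, but the per-row work is equivalent: one addition per row once each partition is sorted.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows from the orders table: (user_id, order_date, amount)
orders = [
    (1, "2024-01-01", 10.0),
    (1, "2024-01-05", 25.0),
    (2, "2024-01-02", 40.0),
    (1, "2024-01-09", 5.0),
    (2, "2024-01-07", 15.0),
]

def running_totals(rows):
    """Mimic SUM(amount) OVER (PARTITION BY user_id ORDER BY order_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)."""
    out = []
    # Sort by the partition key, then by the ORDER BY column.
    rows = sorted(rows, key=itemgetter(0, 1))
    for _, group in groupby(rows, key=itemgetter(0)):
        total = 0.0  # the running total resets at each partition boundary
        for user_id, order_date, amount in group:
            total += amount  # one addition per row: O(1) work per row
            out.append((user_id, order_date, amount, total))
    return out

for row in running_totals(orders):
    print(row)
```

Note that the running total for user 2 starts over at 40.0 rather than continuing from user 1's total — that is the effect of `PARTITION BY user_id`.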
This query calculates a running total of order amounts for each user, ordered by date.
Identify which operations repeat, and how often, as the query runs.
- Primary operation: Calculating the running total for each row using the window function.
- How many times: Once per row in the input data.
As the number of rows grows, the work to compute running totals grows roughly in proportion to the number of rows.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 running total calculations |
| 100 | About 100 running total calculations |
| 1000 | About 1000 running total calculations |
Pattern observation: The number of operations grows linearly with the number of rows.
Time Complexity: O(n) for the window aggregation itself. (Snowflake must also sort the rows within each partition by order_date, which adds an O(n log n) step, but the running-total pass is a single addition per row.)
This means the time to compute the running totals grows directly with the number of rows.
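The linear pattern in the table can be checked empirically by counting additions, under the simplifying assumption that each row costs one addition once the data is sorted:

```python
def count_running_total_ops(n):
    """Count the additions needed for a running total over n rows (one partition)."""
    ops = 0
    total = 0
    for amount in range(n):  # stand-in for n ordered rows
        total += amount      # carry the running total forward
        ops += 1             # one addition per row
    return ops

for n in (10, 100, 1000):
    print(n, count_running_total_ops(n))  # operation count grows 1:1 with n
```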
[X] Wrong: "Window functions always take much longer than simple queries because they do a lot more work."
[OK] Correct: While window functions do extra work per row, Snowflake computes a running total in a single streaming pass, so the work grows linearly with data size — not quadratically, as it would if each row's frame were re-summed from scratch.
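The misconception usually comes from imagining a naive plan that re-sums the entire frame for every output row, which would be quadratic. A streaming plan does one addition per row. A rough operation-count comparison (illustrative only, not Snowflake's actual operators):

```python
def naive_ops(n):
    # Re-sum rows 0..i for every row i: 1 + 2 + ... + n additions -> O(n^2)
    return sum(i + 1 for i in range(n))

def streaming_ops(n):
    # Carry the previous total forward: one addition per row -> O(n)
    return n

for n in (10, 100, 1000):
    print(n, naive_ops(n), streaming_ops(n))
```

At n = 1000 the naive plan would already cost 500,500 additions versus 1,000 for the streaming plan, and the gap widens quadratically from there.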
Understanding how window functions scale helps you explain query performance clearly and shows you know how databases handle data efficiently.
"What if we changed the window frame from UNBOUNDED PRECEDING to a fixed number of rows? How would the time complexity change?"
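One way to reason about that question: with a fixed frame of k rows (e.g. `ROWS BETWEEN 9 PRECEDING AND CURRENT ROW`), a naive plan that re-sums the frame for every row costs O(n·k), but a sliding-sum plan that adds the row entering the frame and subtracts the row leaving it stays O(n), independent of k. A sketch under hypothetical data (not Snowflake internals):

```python
def windowed_totals(amounts, k):
    """Sum over a frame of the current row plus up to k-1 preceding rows,
    maintained as a sliding sum: O(1) work per row, O(n) overall."""
    out = []
    window_sum = 0
    for i, amount in enumerate(amounts):
        window_sum += amount              # row entering the frame
        if i >= k:
            window_sum -= amounts[i - k]  # row leaving the frame
        out.append(window_sum)
    return out

print(windowed_totals([10, 20, 30, 40, 50], 3))  # -> [10, 30, 60, 90, 120]
```

This trick works for invertible aggregates like SUM and COUNT; an aggregate like MAX has no cheap "subtract" step and needs a different structure (e.g. a monotonic deque) to stay linear.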