
Window functions in Snowflake - Step-by-Step Execution

Process Flow - Window functions in Snowflake
Start Query -> Select Data -> Define Window -> Apply Window Function -> Calculate Result per Row -> Return Result Set
The query starts by selecting data, then defines a window (a group of rows), applies the window function to each row within that window, calculates results, and returns the final result set.
Execution Sample
Snowflake
SELECT
  employee_id,
  department,
  salary,
  AVG(salary) OVER (PARTITION BY department ORDER BY employee_id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS avg_salary
FROM employees;
This query calculates the running average salary per department ordered by employee_id.
Process Table
| Step | Row (employee_id) | Partition (department) | Window Frame | Window Function Calculation | avg_salary |
|------|-------------------|------------------------|--------------|-----------------------------|------------|
| 1 | 101 | Sales | Rows from start to current (101) | AVG(salary) of [5000] | 5000 |
| 2 | 102 | Sales | Rows from start to current (101,102) | AVG(salary) of [5000, 6000] | 5500 |
| 3 | 103 | Sales | Rows from start to current (101,102,103) | AVG(salary) of [5000, 6000, 6500] | 5833.33 |
| 4 | 201 | HR | Rows from start to current (201) | AVG(salary) of [4500] | 4500 |
| 5 | 202 | HR | Rows from start to current (201,202) | AVG(salary) of [4500, 4700] | 4600 |
| 6 | 301 | IT | Rows from start to current (301) | AVG(salary) of [7000] | 7000 |
| 7 | 302 | IT | Rows from start to current (301,302) | AVG(salary) of [7000, 7200] | 7100 |
| 8 | 303 | IT | Rows from start to current (301,302,303) | AVG(salary) of [7000, 7200, 7100] | 7100 |
| 9 | END | - | - | - | All rows processed, query returns result set |
💡 All rows processed, window function applied per partition and ordered rows, query completes.
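The trace above can be reproduced locally. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Snowflake (an assumption: SQLite accepts the same standard window syntax, but it is not Snowflake). The sample rows are copied from the Process Table; the table definition and the outer ORDER BY employee_id, added only to make the output order deterministic, are assumptions for illustration.

```python
import sqlite3

# Sample rows copied from the Process Table: (employee_id, department, salary).
ROWS = [
    (101, "Sales", 5000), (102, "Sales", 6000), (103, "Sales", 6500),
    (201, "HR", 4500), (202, "HR", 4700),
    (301, "IT", 7000), (302, "IT", 7200), (303, "IT", 7100),
]

# Same window expression as the Execution Sample; the outer ORDER BY is an
# addition (assumption) so the rows come back in a predictable order.
QUERY = """
SELECT
  employee_id,
  department,
  salary,
  AVG(salary) OVER (
    PARTITION BY department
    ORDER BY employee_id
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS avg_salary
FROM employees
ORDER BY employee_id
"""


def run_sample():
    """Load the sample rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (employee_id INTEGER, department TEXT, salary REAL)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", ROWS)
    result = [
        (emp_id, dept, round(avg, 2))
        for emp_id, dept, _salary, avg in conn.execute(QUERY)
    ]
    conn.close()
    return result


running_averages = run_sample()
for row in running_averages:
    print(row)
```

Each printed tuple should line up with one step of the Process Table, with avg_salary rounded to two decimals.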
Status Tracker
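| Variable | Start | After 1 | After 2 | After 3 | After 4 | After 5 | After 6 | After 7 | After 8 | Final |
|----------|-------|---------|---------|---------|---------|---------|---------|---------|---------|-------|
| employee_id | - | 101 | 102 | 103 | 201 | 202 | 301 | 302 | 303 | - |
| department | - | Sales | Sales | Sales | HR | HR | IT | IT | IT | - |
| salary | - | 5000 | 6000 | 6500 | 4500 | 4700 | 7000 | 7200 | 7100 | - |
| avg_salary | - | 5000 | 5500 | 5833.33 | 4500 | 4600 | 7000 | 7100 | 7100 | - |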
Key Moments - 3 Insights
Why does the average salary change only within each department and not across all employees?
Because the window function uses PARTITION BY department, it calculates averages separately for each department, as shown in the Process Table: steps 1-3 for Sales, 4-5 for HR, and 6-8 for IT.
What does the window frame 'ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW' mean?
It means the function calculates the average from the first row in the partition up to and including the current row, so the frame grows by one row at each step, as the Window Frame column of the Process Table shows.
Why is the order by employee_id important in the window function?
Ordering by employee_id fixes the sequence in which rows enter the frame, so it determines which salaries are included in the running average at each step of the Process Table.
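The three points above (partitioning, ordering, and a frame that grows to the current row) can be sketched in plain Python. This is an illustrative emulation of the idea under stated assumptions, not how Snowflake executes the query; the function name running_average is hypothetical, and the input rows are the tutorial's sample data listed deliberately out of order.

```python
from itertools import groupby
from operator import itemgetter

# (employee_id, department, salary), intentionally shuffled to show that the
# partition and order steps, not the input order, drive the result.
ROWS = [
    (303, "IT", 7100), (102, "Sales", 6000), (201, "HR", 4500),
    (101, "Sales", 5000), (302, "IT", 7200), (202, "HR", 4700),
    (103, "Sales", 6500), (301, "IT", 7000),
]


def running_average(rows):
    """Emulate AVG(salary) OVER (PARTITION BY department ORDER BY employee_id
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)."""
    result = {}
    # PARTITION BY department: sort so groupby sees each department as one run.
    by_department = sorted(rows, key=itemgetter(1))
    for _dept, partition in groupby(by_department, key=itemgetter(1)):
        # ORDER BY employee_id: fix the row sequence inside the partition.
        ordered = sorted(partition, key=itemgetter(0))
        frame = []  # restarts empty for every partition
        for emp_id, _d, salary in ordered:
            frame.append(salary)  # frame now spans first row .. current row
            result[emp_id] = round(sum(frame) / len(frame), 2)
    return result


averages = running_average(ROWS)
for emp_id in sorted(averages):
    print(emp_id, averages[emp_id])
```

Because the frame list is reset for each department, the average never carries over from one partition to the next.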
Visual Quiz - 3 Questions
Test your understanding
Look at the Process Table at step 5. What is the avg_salary value for employee_id 202 in HR?
A. 4500
B. 4700
C. 4600
D. 4550
💡 Hint
Check the avg_salary column in the Process Table at step 5 for employee_id 202.
At which step does the window function finish processing all rows in the IT department?
A. Step 8
B. Step 7
C. Step 6
D. Step 9
💡 Hint
Look at the Partition column in the Process Table and find the last step whose department is IT.
If we remove ORDER BY employee_id from the window clause, how would the avg_salary values change?
A. They would be calculated over the entire partition without order, so the running average would not accumulate.
B. The avg_salary would be the overall average per department for all rows.
C. The query would fail with an error.
D. They would be the same as now.
💡 Hint
Consider how ORDER BY affects the window frame and the calculation at each step of the Process Table.
Concept Snapshot
Window functions in Snowflake:
- Use OVER() clause with PARTITION BY and ORDER BY
- PARTITION BY groups rows like departments
- ORDER BY defines row sequence inside partitions
- Window frame defines rows considered per calculation
- Functions compute values per row over the window
- Useful for running totals, averages, ranks without grouping
Full Transcript
This visual execution traces a Snowflake window function calculating running average salary per department. The query selects employee data and applies AVG() over a window partitioned by department and ordered by employee_id. Each step shows how the window frame grows from the first row to the current row in the partition, updating the average salary. Variables track employee_id, department, salary, and the computed avg_salary. Key moments clarify partitioning, window frame meaning, and ordering importance. The quiz tests understanding of values at specific steps and effects of removing ORDER BY. The snapshot summarizes syntax and behavior for quick reference.

Practice

1. What does a window function in Snowflake do?
easy
A. Calculates values across rows related to the current row without grouping them into fewer rows
B. Groups rows and reduces the number of rows returned
C. Deletes duplicate rows from the result set
D. Creates a new table from existing data

Solution

  1. Step 1: Understand window function purpose

    Window functions perform calculations across a set of rows related to the current row but do not reduce the number of rows returned.
  2. Step 2: Compare with grouping

    Unlike GROUP BY, window functions keep all rows visible while calculating values like running totals or ranks.
  3. Final Answer:

    Calculates values across rows related to the current row without grouping them into fewer rows -> Option A
  4. Quick Check:

    Window functions analyze rows without grouping = A [OK]
Hint: Window functions keep all rows, unlike GROUP BY [OK]
Common Mistakes:
  • Confusing window functions with GROUP BY aggregation
  • Thinking window functions reduce row count
  • Assuming window functions delete duplicates
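The difference between GROUP BY and a window function is easiest to see by counting rows. A minimal sketch, again using Python's sqlite3 as a stand-in for Snowflake; the sales table, its columns, and its five rows are hypothetical data invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, assumed only for this comparison.
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100), ("East", 300), ("West", 200), ("West", 400), ("West", 600)],
)

# GROUP BY collapses each region into a single output row.
grouped = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# The window version keeps every input row and attaches the region total to it.
windowed = conn.execute(
    "SELECT region, amount, SUM(amount) OVER (PARTITION BY region) AS region_total "
    "FROM sales ORDER BY region, amount"
).fetchall()
conn.close()

print(len(grouped), "rows with GROUP BY")
print(len(windowed), "rows with the window function")
for row in windowed:
    print(row)
```

The grouped query returns one row per region, while the windowed query returns all five original rows with the total repeated alongside each.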
2. Which of the following is the correct syntax to calculate a running total of sales using a window function in Snowflake?
easy
A. SELECT SUM(sales) GROUP BY region ORDER BY date FROM sales_data;
B. SELECT sales + PREVIOUS(sales) FROM sales_data;
C. SELECT RUNNING_TOTAL(sales) FROM sales_data;
D. SELECT SUM(sales) OVER (PARTITION BY region ORDER BY date) FROM sales_data;

Solution

  1. Step 1: Identify correct window function syntax

    SUM(sales) OVER (PARTITION BY region ORDER BY date) correctly calculates a running total partitioned by region and ordered by date.
  2. Step 2: Eliminate incorrect options

    Option A places GROUP BY and ORDER BY before FROM and relies on GROUP BY, which reduces rows; it is not a window function. Options B and C use invalid functions or syntax.
  3. Final Answer:

    SELECT SUM(sales) OVER (PARTITION BY region ORDER BY date) FROM sales_data; -> Option D
  4. Quick Check:

    SUM() OVER with PARTITION BY and ORDER BY = D [OK]
Hint: Look for SUM() OVER with PARTITION BY and ORDER BY [OK]
Common Mistakes:
  • Using GROUP BY instead of OVER clause
  • Using non-existent functions like RUNNING_TOTAL
  • Omitting ORDER BY in window function
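To see the running total from the correct option accumulate, here is a small sketch run through Python's sqlite3 as a stand-in for Snowflake. The sales_data rows are hypothetical, and the running_total alias and outer ORDER BY are additions for readable, deterministic output. Dates are kept unique within each region, because rows that tie on the ORDER BY key may be summed together under the default frame.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows for the sales_data table named in the question.
conn.execute("CREATE TABLE sales_data (region TEXT, date TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?)",
    [
        ("East", "2024-01-01", 100),
        ("East", "2024-01-02", 150),
        ("East", "2024-01-03", 50),
        ("West", "2024-01-01", 200),
        ("West", "2024-01-02", 25),
    ],
)

# Window expression from the correct option, with an alias and an outer
# ORDER BY added (assumptions) so the output is easy to read.
running = conn.execute(
    """
    SELECT region, date, sales,
           SUM(sales) OVER (PARTITION BY region ORDER BY date) AS running_total
    FROM sales_data
    ORDER BY region, date
    """
).fetchall()
conn.close()

for row in running:
    print(row)
```

The total restarts at the first West row because PARTITION BY region gives each region its own window.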
3. Given the table sales with columns region, date, and amount, what is the output of this query?
SELECT region, date, amount, RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank FROM sales;
medium
A. Ranks sales amounts within each region from highest to lowest
B. Ranks sales amounts across all regions ignoring region groups
C. Calculates cumulative sum of amounts per region
D. Returns the total number of sales per region

Solution

  1. Step 1: Understand RANK() with PARTITION BY and ORDER BY

    RANK() assigns ranks starting at 1 within each partition (region), ordering by amount descending.
  2. Step 2: Interpret the query output

    The query shows each sale with its rank in its region based on amount, highest amount ranked 1.
  3. Final Answer:

    Ranks sales amounts within each region from highest to lowest -> Option A
  4. Quick Check:

    RANK() OVER PARTITION BY region ORDER BY amount DESC = A [OK]
Hint: RANK() with PARTITION BY ranks within groups [OK]
Common Mistakes:
  • Thinking RANK() ignores PARTITION BY
  • Confusing RANK() with cumulative sum
  • Assuming ranks are across all rows without grouping
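A short sketch of the ranking behavior, run through Python's sqlite3 as a stand-in for Snowflake. The sales rows are hypothetical and include a tie so the gap that RANK() leaves after tied values is visible; the alias is written as amount_rank here purely for readability, and the date column is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data with a tie (two East sales of 300).
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("East", 300), ("East", 500), ("East", 100), ("East", 300),
        ("West", 400), ("West", 900),
    ],
)

ranked = conn.execute(
    """
    SELECT region, amount,
           RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS amount_rank
    FROM sales
    ORDER BY region, amount DESC
    """
).fetchall()
conn.close()

for row in ranked:
    print(row)
```

Both East sales of 300 share rank 2 and the next East row gets rank 4, while ranking restarts at 1 for West.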
4. Identify the error in this Snowflake query:
SELECT employee_id, salary, ROW_NUMBER() OVER (ORDER BY salary) PARTITION BY department FROM employees;
medium
A. ORDER BY cannot be used in window functions
B. ROW_NUMBER() cannot be used with ORDER BY
C. PARTITION BY must come before ORDER BY inside OVER()
D. Missing GROUP BY clause for department

Solution

  1. Step 1: Check window function clause order

    In Snowflake, PARTITION BY must appear before ORDER BY inside the OVER() clause.
  2. Step 2: Identify syntax error

    The query places PARTITION BY after ORDER BY and outside the closing parenthesis of OVER(), which is invalid syntax.
  3. Final Answer:

    PARTITION BY must come before ORDER BY inside OVER() -> Option C
  4. Quick Check:

    PARTITION BY before ORDER BY in OVER() = C [OK]
Hint: PARTITION BY always before ORDER BY in OVER() [OK]
Common Mistakes:
  • Placing PARTITION BY after ORDER BY
  • Thinking ROW_NUMBER() disallows ORDER BY
  • Adding unnecessary GROUP BY for window functions
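The broken and corrected forms can be compared side by side. This sketch uses Python's sqlite3 as a stand-in: the rejection shown comes from SQLite's parser, so the error text Snowflake reports will differ. The employees rows and the row_num alias are hypothetical additions for illustration.

```python
import sqlite3

# Query from the question: PARTITION BY sits outside the OVER() parentheses.
BROKEN = (
    "SELECT employee_id, salary, "
    "ROW_NUMBER() OVER (ORDER BY salary) PARTITION BY department "
    "FROM employees"
)
# Corrected form: PARTITION BY first, then ORDER BY, both inside OVER().
FIXED = (
    "SELECT employee_id, salary, "
    "ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary) AS row_num "
    "FROM employees ORDER BY department, salary"
)

conn = sqlite3.connect(":memory:")
# Hypothetical sample rows.
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER, department TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(101, "Sales", 5000), (102, "Sales", 6000), (201, "HR", 4500), (202, "HR", 4700)],
)

try:
    conn.execute(BROKEN)
    broken_failed = False
except sqlite3.Error as exc:
    broken_failed = True
    print("rejected:", exc)

fixed_rows = conn.execute(FIXED).fetchall()
conn.close()

for row in fixed_rows:
    print(row)
```

With the clauses in the right place, numbering restarts at 1 for each department.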
5. You want to calculate the average sales per region and also show each sale's rank by amount within its region. Which query correctly combines these using window functions?
hard
A. SELECT region, amount, AVG(amount) PARTITION BY region, RANK() ORDER BY amount DESC FROM sales;
B. SELECT region, amount, AVG(amount) OVER (PARTITION BY region) AS avg_region, RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank FROM sales;
C. SELECT region, amount, AVG(amount), RANK() FROM sales GROUP BY region ORDER BY amount DESC;
D. SELECT region, amount, AVG(amount) OVER (), RANK() OVER (ORDER BY amount) FROM sales;

Solution

  1. Step 1: Use AVG() as window function partitioned by region

    AVG(amount) OVER (PARTITION BY region) calculates average sales per region without grouping rows.
  2. Step 2: Use RANK() partitioned by region ordered by amount descending

    RANK() OVER (PARTITION BY region ORDER BY amount DESC) ranks sales within each region.
  3. Step 3: Verify query correctness

    SELECT region, amount, AVG(amount) OVER (PARTITION BY region) AS avg_region, RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank FROM sales; correctly uses window functions with proper syntax and clauses.
  4. Final Answer:

    SELECT region, amount, AVG(amount) OVER (PARTITION BY region) AS avg_region, RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank FROM sales; -> Option B
  5. Quick Check:

    AVG() and RANK() with PARTITION BY region = B [OK]
Hint: Use OVER(PARTITION BY region) for both AVG and RANK [OK]
Common Mistakes:
  • Using GROUP BY instead of window functions
  • Incorrect syntax for window functions
  • Omitting PARTITION BY for per-region calculations
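The combined query can be tried on a few rows to confirm that both window functions work per region at once. A minimal sketch using Python's sqlite3 as a stand-in for Snowflake; the sales rows are hypothetical, the rank alias is written as amount_rank for readability, and the outer ORDER BY is added only for deterministic output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data.
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100), ("East", 300), ("West", 200), ("West", 400), ("West", 600)],
)

combined = conn.execute(
    """
    SELECT region, amount,
           AVG(amount) OVER (PARTITION BY region) AS avg_region,
           RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS amount_rank
    FROM sales
    ORDER BY region, amount DESC
    """
).fetchall()
conn.close()

for row in combined:
    print(row)
```

Every sale keeps its own row, carrying both its region's average and its rank within that region.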