0
0
PostgreSQLquery~15 mins

Window frame (ROWS BETWEEN, RANGE BETWEEN) in PostgreSQL - Deep Dive

Choose your learning style9 modes available
Overview - Window frame (ROWS BETWEEN, RANGE BETWEEN)
What is it?
Window frames define which rows are included when calculating window functions in SQL. They specify a subset of rows around the current row to perform calculations like sums or averages. The two main types are ROWS BETWEEN and RANGE BETWEEN, which differ in how they select rows. This helps analyze data with context, like running totals or moving averages.
Why it matters
Without window frames, window functions would always consider all rows in a partition, making it impossible to calculate running totals or moving averages efficiently. Window frames let you focus calculations on a specific range of rows, enabling powerful, flexible data analysis. This makes reports and insights more meaningful and precise.
Where it fits
Before learning window frames, you should understand basic SQL SELECT queries, aggregation functions, and window functions without frames. After mastering window frames, you can explore advanced analytics like cumulative distributions, lag/lead functions, and performance tuning of window queries.
Mental Model
Core Idea
A window frame selects a moving subset of rows around the current row to apply window functions, controlling the calculation's scope.
Think of it like...
Imagine you are reading a book and highlighting sentences. Instead of highlighting the whole chapter, you highlight only a few sentences before and after the current one to understand context. The window frame is like choosing how many sentences around your current sentence to include.
Partition of rows:
┌─────────────┐
│ Row 1       │
│ Row 2       │
│ Row 3  ◀── Current row
│ Row 4       │
│ Row 5       │
└─────────────┘

ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING includes Row 2, Row 3, Row 4
RANGE BETWEEN 10 PRECEDING AND CURRENT ROW includes rows with values within 10 units before current row's value
Build-Up - 7 Steps
1
FoundationIntroduction to Window Functions
🤔
Concept: Window functions perform calculations across a set of table rows related to the current row.
Window functions like SUM(), AVG(), ROW_NUMBER() compute values over partitions of data without collapsing rows. For example, SUM(sales) OVER (PARTITION BY region) calculates total sales per region but keeps all rows.
Result
You get a new column with aggregated values repeated for each row in the partition.
Understanding window functions is essential because window frames control which rows these functions consider, enabling more precise calculations.
2
FoundationBasic Window Frame Concept
🤔
Concept: Window frames limit the rows considered by a window function to a subset around the current row.
By default, window functions consider all rows in the partition. Adding a frame clause like ROWS BETWEEN 1 PRECEDING AND CURRENT ROW restricts the calculation to the current row and one before it.
Result
Calculations like running totals or moving averages become possible by focusing on a sliding window of rows.
Knowing that window frames define the calculation scope helps you control how data is aggregated or analyzed in context.
3
IntermediateROWS BETWEEN Frame Explained
🤔Before reading on: do you think ROWS BETWEEN counts rows by position or by value? Commit to your answer.
Concept: ROWS BETWEEN defines the frame by counting physical row positions relative to the current row.
For example, ROWS BETWEEN 2 PRECEDING AND 1 FOLLOWING includes the two rows before and one row after the current row, regardless of their values. It is a fixed number of rows.
Result
You get a frame that moves row by row, always including the same number of rows around the current one.
Understanding ROWS BETWEEN as a positional window helps you predict exactly which rows are included, useful for fixed-size moving calculations.
4
IntermediateRANGE BETWEEN Frame Explained
🤔Before reading on: does RANGE BETWEEN select rows by position or by value range? Commit to your answer.
Concept: RANGE BETWEEN defines the frame by value ranges relative to the current row's ordering column.
For example, RANGE BETWEEN 10 PRECEDING AND CURRENT ROW includes all rows with values within 10 units less than or equal to the current row's value. It depends on the ORDER BY column's values, not row count.
Result
You get a frame that can include varying numbers of rows depending on data distribution and values.
Knowing RANGE BETWEEN selects rows by value range helps you handle cases where data values matter more than row positions, like time intervals.
5
IntermediateDifferences Between ROWS and RANGE
🤔
Concept: ROWS counts physical rows; RANGE counts rows based on value ranges in the ORDER BY column.
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW always includes exactly 3 rows (if available). RANGE BETWEEN 10 PRECEDING AND CURRENT ROW includes all rows with values within 10 units before current row's value, which can be more or fewer rows.
Result
You can choose the frame type based on whether you want fixed-size windows or value-based windows.
Recognizing the difference prevents unexpected results, especially with duplicate or unevenly spaced values.
6
AdvancedUsing UNBOUNDED Frames for Aggregates
🤔Before reading on: does UNBOUNDED mean no limit or zero rows? Commit to your answer.
Concept: UNBOUNDED PRECEDING or FOLLOWING means the frame extends to the start or end of the partition.
For example, ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW calculates a running total from the first row up to the current row. This is common for cumulative sums.
Result
You get cumulative aggregates that grow as you move through the partition.
Understanding UNBOUNDED frames enables powerful cumulative calculations essential for trends and time series analysis.
7
ExpertSubtleties and Surprises in RANGE Frames
🤔Before reading on: do you think RANGE BETWEEN works with multiple ORDER BY columns? Commit to your answer.
Concept: RANGE BETWEEN has limitations and behaves differently with multiple ORDER BY columns and non-numeric types.
In PostgreSQL, RANGE BETWEEN supports only one ORDER BY column. Also, RANGE frames treat NULLs and duplicates specially, sometimes including unexpected rows. This can cause confusion if not understood.
Result
Misusing RANGE can lead to incorrect frame sizes and wrong results in window functions.
Knowing RANGE's limitations and behavior prevents subtle bugs and helps choose the right frame type for complex queries.
Under the Hood
PostgreSQL processes window functions by first sorting rows according to PARTITION BY and ORDER BY clauses. Then, for each row, it determines the frame boundaries based on the frame clause (ROWS or RANGE). ROWS counts physical rows relative to the current row's position. RANGE calculates frame boundaries by comparing ORDER BY column values, including all rows with values within the specified range. The function is then applied to the rows inside this frame.
Why designed this way?
The design separates logical partitioning and ordering from frame specification to allow flexible, efficient calculations. ROWS provides precise control by position, useful for fixed-size windows. RANGE allows value-based frames, useful for time or numeric ranges. This dual approach balances expressiveness and performance. Alternatives like only positional frames would limit use cases; only value-based frames would be less predictable and harder to optimize.
┌─────────────────────────────┐
│ Partition (sorted by ORDER) │
├─────────────────────────────┤
│ Row 1 (value=10)            │
│ Row 2 (value=15)            │
│ Row 3 (value=20) ◀ current  │
│ Row 4 (value=25)            │
│ Row 5 (value=30)            │
└─────────────────────────────┘

ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING frame:
  Includes Row 2, Row 3, Row 4 (positions)

RANGE BETWEEN 5 PRECEDING AND CURRENT ROW frame:
  Includes rows with values >= 15 (20-5) and <= 20, so Row 2 and Row 3
Myth Busters - 4 Common Misconceptions
Quick: Does RANGE BETWEEN always include the same number of rows as ROWS BETWEEN? Commit yes or no.
Common Belief:RANGE BETWEEN works just like ROWS BETWEEN but uses value ranges instead of positions, so they behave similarly.
Tap to reveal reality
Reality:RANGE BETWEEN can include a variable number of rows depending on data values and duplicates, unlike ROWS which always counts fixed row positions.
Why it matters:Assuming they behave the same can cause wrong calculations, especially with uneven data or duplicates, leading to incorrect reports.
Quick: Can RANGE BETWEEN be used with multiple ORDER BY columns? Commit yes or no.
Common Belief:You can use RANGE BETWEEN with multiple ORDER BY columns to define complex frames.
Tap to reveal reality
Reality:PostgreSQL only supports RANGE BETWEEN with a single ORDER BY column; using multiple columns causes errors or unexpected behavior.
Why it matters:Trying to use RANGE with multiple columns can break queries or produce wrong results, wasting time debugging.
Quick: Does UNBOUNDED PRECEDING mean the frame includes no rows before the current? Commit yes or no.
Common Belief:UNBOUNDED PRECEDING means no rows are included before the current row.
Tap to reveal reality
Reality:UNBOUNDED PRECEDING means the frame starts at the very first row of the partition, including all rows before the current one.
Why it matters:Misunderstanding UNBOUNDED leads to incorrect cumulative calculations and wrong aggregates.
Quick: Does the default frame for window functions include all rows in the partition? Commit yes or no.
Common Belief:By default, window functions consider only the current row unless a frame is specified.
Tap to reveal reality
Reality:The default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW for ordered window functions, including all rows up to the current one.
Why it matters:Not knowing the default frame can cause confusion when results seem cumulative without explicit frame clauses.
Expert Zone
1
RANGE frames can behave unexpectedly with non-numeric ORDER BY columns, like dates or text, due to how value ranges are calculated.
2
Using ROWS frames with large partitions can impact performance because the database must track physical row positions precisely.
3
The interaction of frame clauses with window function FILTER clauses can produce subtle differences in results.
When NOT to use
Avoid RANGE BETWEEN when your ORDER BY column has duplicates or non-numeric types that complicate value ranges; use ROWS BETWEEN instead. For very large datasets where performance is critical, consider pre-aggregating data or using indexed materialized views instead of complex window frames.
Production Patterns
Common patterns include running totals with ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, moving averages with ROWS BETWEEN N PRECEDING AND CURRENT ROW, and time-based frames with RANGE BETWEEN intervals on timestamp columns. Experts also combine frames with PARTITION BY to segment data by groups like regions or categories.
Connections
Time Series Analysis
Window frames enable calculations like moving averages and cumulative sums essential in time series data.
Understanding window frames helps grasp how time-based trends and rolling metrics are computed efficiently in databases.
Sliding Window Algorithms (Computer Science)
Window frames in SQL are a declarative form of sliding window algorithms that process data subsets incrementally.
Recognizing this connection clarifies how databases optimize repeated calculations over moving data windows.
Signal Processing
Window frames conceptually resemble signal processing windows that isolate parts of a signal for analysis.
Knowing this cross-domain similarity reveals how framing data is a universal technique for focused analysis.
Common Pitfalls
#1Using RANGE BETWEEN with multiple ORDER BY columns causes errors.
Wrong approach:SELECT SUM(sales) OVER (ORDER BY date, region RANGE BETWEEN 10 PRECEDING AND CURRENT ROW) FROM sales_data;
Correct approach:SELECT SUM(sales) OVER (ORDER BY date RANGE BETWEEN 10 PRECEDING AND CURRENT ROW) FROM sales_data;
Root cause:Misunderstanding that PostgreSQL supports RANGE frames only with a single ORDER BY column.
#2Assuming ROWS BETWEEN counts rows by value instead of position.
Wrong approach:SELECT SUM(amount) OVER (ORDER BY id ROWS BETWEEN 10 PRECEDING AND CURRENT ROW) FROM transactions WHERE id IN (5,6,7); -- expecting rows with id values 5 to 15
Correct approach:SELECT SUM(amount) OVER (ORDER BY id ROWS BETWEEN 10 PRECEDING AND CURRENT ROW) FROM transactions; -- counts 10 physical rows before current
Root cause:Confusing ROWS frame as value-based rather than position-based.
#3Omitting frame clause and expecting no cumulative effect.
Wrong approach:SELECT SUM(sales) OVER (PARTITION BY region ORDER BY date) FROM sales_data; -- expecting sum only for current row
Correct approach:SELECT SUM(sales) OVER (PARTITION BY region ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM sales_data;
Root cause:Not knowing the default frame includes all rows up to current, causing cumulative sums.
Key Takeaways
Window frames control which rows a window function uses for each calculation, enabling flexible, context-aware analysis.
ROWS BETWEEN counts rows by physical position relative to the current row, providing fixed-size windows.
RANGE BETWEEN selects rows based on value ranges in the ORDER BY column, which can include variable numbers of rows.
Understanding the differences and limitations of ROWS and RANGE frames prevents common bugs and unexpected results.
Expert use of window frames powers advanced analytics like running totals, moving averages, and time-based calculations in SQL.