- SQL Chips and Chunks Newsletter
- Posts
- Week 29 - SQL Chips and Chunks Newsletter
Week 29 - SQL Chips and Chunks Newsletter
SQL Insights : The Power of ROWS and RANGE in Window Functions
1. SQL Chip: Window Function Power Moves
The OVER clause defines the window of rows for a function (like SUM() or AVG()) to operate on. Most often, we use PARTITION BY and ORDER BY for this. However, to calculate things like moving averages or cumulative totals, you need to define a smaller frame within that window using ROWS or RANGE.
ROWS: Counts rows. It specifies a fixed number of rows before and after the current row. For example,ROWS BETWEEN 5 PRECEDING AND CURRENT ROWincludes the current row and the 5 preceding rows.RANGE: Counts values. It specifies a range of values in theORDER BYcolumn. This is less common but useful for things like finding all rows within a certain time or price difference from the current row.
2. SQL Chunk: Calculating a 7-Day Moving Average
Let's use a sample sales table to calculate a 7-day moving average of daily sales, which smooths out daily fluctuations to reveal underlying trends.
SELECT
sale_date,
daily_revenue,
AVG(daily_revenue) OVER (
ORDER BY sale_date
-- Define the window frame: 6 preceding rows and the current row (7 days total)
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS seven_day_moving_avg
FROM
sales
ORDER BY
sale_date
LIMIT 10;
3. SQL Challenge: Cumulative Total and Percentile Rank by Category
We'll combine partitioning, ordering, and the unbounded frame to calculate a cumulative total of inventory items within each product category. We'll also calculate a percentile rank of items based on their price, a key analysis for category managers.
Imagine a table called inventory_item with category, item_price, and stock_count.
SELECT
category,
item_price,
stock_count,
-- Cumulative Stock Count: Sums all stock_counts up to the current row within each category.
SUM(stock_count) OVER (
PARTITION BY category
ORDER BY item_price
-- Defines the frame from the very first row up to the current row.
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_category_stock,
-- Percentile Rank: Shows where the item's price falls relative to others in its category (0 to 1).
PERCENT_RANK() OVER (
PARTITION BY category
ORDER BY item_price
) AS price_percentile_rank
FROM
inventory_item
ORDER BY
category,
item_price DESC;
4. SQL Mistake: Misusing GROUP BY for Cumulative Totals
A very common mistake is trying to force a cumulative total using only GROUP BY and a subquery. This results in incorrect or extremely slow SQL.
❌ The Mistake: Incorrect Grouped Subquery
-- This query attempts to get a running total without a window function.
-- It's computationally expensive and often results in a Cartesian join-like performance issue for large data.
SELECT
s1.sale_date,
s1.daily_revenue,
SUM(s2.daily_revenue) AS cumulative_revenue_mistake
FROM
sales s1
JOIN
sales s2 ON s2.sale_date <= s1.sale_date -- This is the slow, self-join for a running total.
GROUP BY
s1.sale_date,
s1.daily_revenue
ORDER BY
s1.sale_date;
✅ The Fix: The Efficiency of the Window Function
The SUM() OVER (ORDER BY ... ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) from Section 3 is the correct, efficient, and clean way to handle cumulative sums. The SQL engine is highly optimized for this syntax.
5. SQL in Practice: Finance - Quarterly Investment Performance
In the Finance sector, calculating percentiles is critical for evaluating investment performance. A fund manager needs to know how their fund's return compares to all other similar funds.
Imagine a table fund_performance with fund_id, quarter, and qtr_return_pct. We can calculate the fund's percentile ranking within all funds for a given quarter.
SELECT
fund_id,
quarter,
qtr_return_pct,
-- NTILE(100) divides the returns into 100 buckets (percentiles), ranking funds from 1 to 100.
NTILE(100) OVER (
PARTITION BY quarter -- Rank only against funds in the same quarter
ORDER BY qtr_return_pct DESC -- Highest return gets the highest rank
) AS performance_percentile_rank
FROM
fund_performance
WHERE
quarter = '2024-Q3'
ORDER BY
performance_percentile_rank DESC,
qtr_return_pct DESC;
This query instantly tells the manager, for example, "Fund X is in the 95th percentile for this quarter," a crucial metric.
6. SQL Resource: YouTube Deep Dive
The Ultimate SQL Window Functions Course (Frame Clause Deep Dive)
This YouTube tutorial offers a dedicated, clear visual explanation of the frame clause, including ROWS and RANGE, which is the core advanced concept this week. It visually walks you through how ROWS BETWEEN counts physical rows for things like moving averages, and how RANGE BETWEEN groups rows by value for more complex analysis. Mastering this distinction is what truly sets advanced SQL users apart and cements the power moves we discussed.
You can find this resource by searching on YouTube for: "The Ultimate SQL Window Functions Course | Complete End-to-End Guide"