Week 27 - SQL Chips and Chunks Newsletter

SQL Insights : The Power of LAG ( )

1. SQL Chip: The Power of LAG()

The LAG() function is a lifesaver for time-series analysis in SQL. It lets you retrieve the value of an expression from a row that precedes the current row within a partition. Think of it as having "memory" in your query—it allows you to look back at the previous data point without needing a complex self-join. This is perfect for calculating a delta (the difference) between the current data and the last data point, such as month-over-month sales growth or daily stock price change.

2. SQL Chunk: Comparing Sales Day-to-Day

Let's imagine you have a table of daily sales figures and you want to see yesterday's sales amount right next to today's sales.

SalesDate

DailyRevenue

2025-10-01

1000

2025-10-02

1200

2025-10-03

950

SELECT
    SalesDate,
    DailyRevenue AS TodayRevenue,
    -- LAG() pulls the DailyRevenue from the row 1 position (day) prior, ordered by SalesDate.
    LAG(DailyRevenue, 1, 0.00) OVER (ORDER BY SalesDate) AS YesterdayRevenue
FROM
    DailySales;

3. SQL Challenge: Calculating the Daily Difference

Now that we have yesterday's revenue next to today's, the natural next step is to calculate the daily change. We can do this by using the LAG() function directly within a mathematical calculation.

SELECT
    SalesDate,
    DailyRevenue AS CurrentRevenue,
    LAG(DailyRevenue, 1, DailyRevenue) OVER (ORDER BY SalesDate) AS PreviousRevenue,
    -- Calculate the day-over-day change
    DailyRevenue - LAG(DailyRevenue, 1, DailyRevenue) OVER (ORDER BY SalesDate) AS RevenueChange
FROM
    DailySales
ORDER BY
    SalesDate DESC
LIMIT 10;

4. SQL Mistake: The Missing ORDER BY

A very common mistake with all window functions (including LAG() and LEAD()) is forgetting the crucial role of the ORDER BY clause inside the OVER() window.

The Mistake: Not including ORDER BY or ordering by a non-sequential column. If you don't use ORDER BY, the function doesn't know which row is "previous" or "next," making the results arbitrary (meaningless).

The Fix: Always specify the column that defines the sequence (usually a date or an ID) within the OVER() clause to ensure LAG() correctly identifies the preceding row.

SELECT
    TransactionID,
    TransactionDate,
    SalesAmount,
    -- CORRECT: The ORDER BY TransactionDate is required inside OVER() 
    -- to define the sequential path for LAG() to look backward.
    LAG(SalesAmount) OVER (ORDER BY TransactionDate ASC) AS PreviousSaleAmount
FROM
    SalesTransactions
ORDER BY
    TransactionDate;

5. SQL in Practice: E-Commerce Inventory Tracking

In E-commerce, inventory managers need to track stock levels to avoid "stockouts." LAG() helps track when a product was last replenished or how much stock was on hand before the most recent sale or restock event.

Imagine an InventoryLog table that tracks every change for every product.

SELECT
    LogID,
    ProductID,
    EventDate,
    StockQuantity AS CurrentStock,
    -- Check what the stock level was just before this current log entry for the same product.
    LAG(StockQuantity, 1, 0) OVER (
        PARTITION BY ProductID -- Restart the comparison for each unique product
        ORDER BY EventDate ASC
    ) AS PreviousStockLevel
FROM
    InventoryLog
WHERE
    ProductID = 401
ORDER BY
    EventDate DESC;

6. SQL Resource: Free Advanced SQL Window Functions Tutorial

How It Helps: This free resource, maintained by a recognized SQL expert, goes beyond simple explanations and shows you exactly how and why window functions like LAG() are so performant. It’s perfect for moving past the basics and truly understanding the underlying database mechanics that make these functions powerful tools for advanced data manipulation and efficiency in your future data career. Happy learning!