26th May 2021 8 minutes read How to Compute Year-Over-Year Differences in SQL Ignacio L. Bisso Window Functions Table of Contents What Are Year-Over-Year Metrics From a Business Perspective? How to Calculate YOY metrics with SQL Calculating Month-to-Month and Quarter-to-Quarter Differences What’s Next? How did your business do this year compared to last year? The year before? Find out how to use SQL to calculate year to year and month to month differences. Year-over-year (YOY) comparisons are a popular and effective way to evaluate the performance of several kinds of organizations. Year to year differences are easy to understand – e.g. it’s easy to understand that your revenue growth is 35% year to year. However, calculating this metric in SQL is not so easy! In this article, we are going to learn how to calculate year-to-year and month-over-month differences using SQL. Are you ready to invest in your SQL skills? Let’s go! What Are Year-Over-Year Metrics From a Business Perspective? To know if a business organization is performing well, we usually need to do a comparison based on business metrics. Sometimes we can compare against similar companies or with the average of several companies in the same market. However, one key performance indicator is periodic growth: the comparison of your company’s current results against the same metric from a previous time period. This is the reason for year-to-year comparisons: You can easily see if your organization is doing better (or worse) than last year. Moreover, year-to-year comparisons can be applied to different metrics (e.g. sales, profits, number of customers) to better understand how different business indicators are evolving. In the following table, we can see the results for 2019 and 2020: Metrics20192020 Revenue$4 300 000$4 800 000 Costs$1 700 000$2 600 000 Profit60%45% Number of customers12 00012 200 If we add a third column called YOY, we can easily see the variation from year to year. This lets us better understand how our organization performed in the last year and what areas can be improved: Metrics20192020YOY Revenue$4 300 000$4 800 000$500 000 Costs$1 700 000$2 600 000$900 000 Profit60%45%-15% Number of customers12 00012 200200 In this article, we will explain how we can calculate the values in the YOY column. Let’s start by showing the base table, where we have the metrics for 2019 and 2020. In the next image, we can see these values in the yearly_metrics table. YearRevenueCostProfitNumber_of_customers 2019430000017000006012000 2020480000026000004512200 202118000007500005812280 In the next section, we’ll explain how to query the yearly_metrics table using SQL to obtain the values in the YOY column. Before that, I would like to suggest LearnSQL.com’s Window Functions course, where you can learn the basics of SQL window functions. Also, the article When Do I Use SQL Window Functions provides a lot of example queries that you may find helpful. How to Calculate YOY metrics with SQL Window functions are a very powerful SQL feature. They return the result of applying a function (like MAX(), AVG() or COUNT()) to a set of records (which is called the “window”) in a table. The set of records is defined by the OVER() clause; this clause is mandatory for window function. You can also order the records in the window by different criteria and then use functions like FIRST_VALUE(), LAST_VALUE(), LEAD(), or LAG() to return the value from specific records in relation to the current record. If you want to review the basics of window functions, I suggest the article What Is the Over Clause?. Understanding SQL window functions and the OVER clause will help you with the concepts we’ll discuss in this article. Now, let’s see an example of a SQL query that returns the revenue for each year and its preceding year: SELECT year, revenue, LAG(revenue) OVER ( ORDER BY year ) AS Revenue_Previous_Year FROM yearly_metrics In this example, we are using the LAG() window function to obtain the value of the column revenue for the previous record. In other words, if the current record is for 2020, LAG(revenue) will return the value of the revenue column for the year 2019. The results of this query are: YearRevenueRevenue Previous Year 20194300000NULL 202048000004300000 202118000004800000 The next step to obtain the YOY value for revenue is simple; we only need to calculate the difference between revenue for 2020 and revenue for 2019. Here’s how we’d do that: SELECT year, revenue, LAG(revenue) OVER ( ORDER BY year ) AS Revenue_Previous_Year revenue - LAG(revenue) OVER ( ORDER BY year ) AS YOY_Difference FROM yearly_metrics We calculated the YOY Difference by doing an arithmetic difference between revenue in 2020 and revenue in 2019. In the following result, we can see the record for 2019 doesn’t have a YOY Difference value because we don’t have data for 2018. Also, the record for 2021 doesn’t have a consistent value in YOY Difference because the data for 2021 is not complete. YearRevenueRevenue Previous YearYOY Difference 20194300000NULLNULL 202048000004300000500000 202118000004800000-3000000 Before going to the next section, I’d like to suggest the article How to Calculate the Difference Between Two Rows in SQL, which goes deeper into the calculation of differences using LAG() and LEAD(). Calculating Month-to-Month and Quarter-to-Quarter Differences In the previous section, we couldn’t calculate a consistent value for the YOY Difference for 2021 because we don’t have complete results for 2021. But to see how a business is performing in the current year, we should have some metrics describing its performance. Good key performance indicators can be month-to-month or quarter-to-quarter comparisons (i.e. comparing the first quarter of 2021 to the first quarter of 2020). Before calculating month-over-month or quarter-over-quarter differences, let’s look at the daily_metrics table. It has a record for each day that describes the revenue, costs, and number of new customers for that day. DayRevenueCostNew Customers 2019-01-01108004650120 2019-01-0210807465080 2020-01-0113720720025 2020-01-0213720720033 2021-01-0112262780010 2021-01-0217388780028 Next, we’ll calculate the monthly_metrics CTE ( a CTE is similar to a view, but it is created during query execution). This has a schema similar to the yearly_metrics table used previously. The monthly_metrics creation SELECT is shown in red; in blue, we can see the SQL query that uses this CTE like a regular table. WITH monthly_metrics AS ( SELECT extract(year from day) as year, extract(month from day) as month, SUM(revenue) as revenue FROM daily_metrics GROUP BY year, month ) SELECT year, month, revenue, LAG(revenue) OVER (ORDER BY year, month) as Revenue_previous_month, revenue - LAG(revenue) OVER (ORDER BY year, month) as Month_to_month_difference FROM monthly_metrics ORDER BY 1,2; In the above query, we first generate the monthly_metrics CTE using a SELECT that extracts the year and month from the day column. Then grouping by year and month, we calculate the SUM of revenue for each month. The CTE monthly_metrics works like a regular table with the columns year, month, and revenue; when the query ends, the CTE is destroyed. You can compute the quarter-to-quarter difference in a similar way. After that (in blue), we obtain the revenue for the previous month using the LAG() window function. Then we calculate the difference between each month and the previous month. Here’s a partial view of the result: YearMonthRevenueRevenue Previous MonthMonth to Month Difference 20191238568937476910920 20201385805385689116 20202370437385805-15368 There is another way to calculate a month-over-month difference. Instead of comparing against the previous month, we can compare against the same month in the previous year. To make this comparison, we need to use the LAG() function’s optional offset parameter, as we can see in the following query: WITH monthly_metrics AS ( SELECT EXTRACT(year from day) as year, EXTRACT(month from day) as month, SUM(revenue) as revenue FROM daily_metrics GROUP BY 1,2 ) SELECT year AS current_year, month AS current_month, revenue AS revenue_current_month, LAG(year,12) OVER ( ORDER BY year, month) AS previous_year, LAG(month,12) OVER ( ORDER BY year, month) AS month_comparing_with, LAG(revenue,12) OVER ( ORDER BY year, month) AS revenue_12_months_ago, revenue - LAG(revenue,12) OVER (ORDER BY year, month) AS month_to_month_difference FROM monthly_metrics ORDER BY 1,2; Above, we used the LAG() function with the optional offset parameter, which allows us to obtain a column value from a record N positions before the current record. If we use an offset of 12, we’ll obtain the record for the same month but in the previous year. Below, we can see the result: Current YearCurrentMonthRevenueCurrent MonthYear ComparingWithMonth ComparingWithRevenue12 MonthsAgoMonth to Month Difference 202013858052019133662849177 202023704372019230656463873 202033955842019334654349041 And that’s it! Now you know how to find month-over-month, quarter-to-quarter, and year-over-year differences with SQL window functions. I’d really suggest the Window Functions course as a good introduction to working with these functions. If you want to learn more, check out this article describing the Window Functions course. What’s Next? Window functions are a key SQL resource. In this article, we used the LAG() window function to calculate year-to-year and month-to-month differences. In fact, we can use it to calculate the difference between any time periods – quarters, half years, months, or weeks. In this article, we focused on differences between time periods, but window functions can be applied to solve many different kinds of data problems. Last but not least, I want to suggest our Window Functions cheat sheet, which I have stuck on a cork board in front of my desk. Invest in you, and grow your SQL skills! Tags: Window Functions