How to Compute Year-Over-Year Differences in SQL

How did your business do this year compared to last year? The year before? Find out how to use SQL to calculate year to year and month to month differences.

Year-over-year (YOY) comparisons are a popular and effective way to evaluate the performance of several kinds of organizations. Year to year differences are easy to understand – e.g. it’s easy to understand that your revenue growth is 35% year to year. However, calculating this metric in SQL is not so easy!

In this article, we are going to learn how to calculate year-to-year and month-over-month differences using SQL.

What Are Year-Over-Year Metrics From a Business Perspective?

To know if a business organization is performing well, we usually need to do a comparison based on business metrics. Sometimes we can compare against similar companies or with the average of several companies in the same market. However, one key performance indicator is periodic growth: the comparison of your company’s current results against the same metric from a previous time period. This is the reason for year-to-year comparisons: You can easily see if your organization is doing better (or worse) than last year. Moreover, year-to-year comparisons can be applied to different metrics (e.g. sales, profits, number of customers) to better understand how different business indicators are evolving.

In the following table, we can see the results for 2019 and 2020:

Metrics20192020
Revenue\$4 300 000\$4 800 000
Costs\$1 700 000\$2 600 000
Profit60%45%
Number of customers12 00012 200

If we add a third column called YOY, we can easily see the variation from year to year. This lets us better understand how our organization performed in the last year and what areas can be improved:

Metrics20192020YOY
Revenue\$4 300 000\$4 800 000\$500 000
Costs\$1 700 000\$2 600 000\$900 000
Profit60%45%-15%
Number of customers12 00012 200200

In this article, we will explain how we can calculate the values in the YOY column. Let’s start by showing the base table, where we have the metrics for 2019 and 2020. In the next image, we can see these values in the `yearly_metrics` table.

YearRevenueCostProfitNumber_of_customers
2019430000017000006012000
2020480000026000004512200
202118000007500005812280

In the next section, we’ll explain how to query the `yearly_metrics` table using SQL to obtain the values in the YOY column. Before that, I would like to suggest LearnSQL.com’s Window Functions course, where you can learn the basics of SQL window functions. Also, the article When Do I Use SQL Window Functions provides a lot of example queries that you may find helpful.

How to Calculate YOY metrics with SQL

Window functions are a very powerful SQL feature. They return the result of applying a function (like `MAX()`, `AVG()` or `COUNT()`) to a set of records (which is called the “window”) in a table. The set of records is defined by the `OVER()` clause; this clause is mandatory for window function. You can also order the records in the window by different criteria and then use functions like `FIRST_VALUE()`, `LAST_VALUE()`, `LEAD()`, or `LAG()` to return the value from specific records in relation to the current record.

If you want to review the basics of window functions, I suggest the article What Is the Over Clause?. Understanding SQL window functions and the `OVER` clause will help you with the concepts we’ll discuss in this article.

Now, let’s see an example of a SQL query that returns the revenue for each year and its preceding year:

```SELECT year,
revenue,
LAG(revenue) OVER ( ORDER BY year ) AS Revenue_Previous_Year
FROM   yearly_metrics
```

In this example, we are using the `LAG()` window function to obtain the value of the column revenue for the previous record. In other words, if the current record is for 2020, `LAG(revenue)` will return the value of the revenue column for the year 2019. The results of this query are:

YearRevenueRevenue Previous Year
20194300000NULL
202048000004300000
202118000004800000

The next step to obtain the YOY value for revenue is simple; we only need to calculate the difference between revenue for 2020 and revenue for 2019. Here’s how we’d do that:

```SELECT year,
revenue,
LAG(revenue) OVER ( ORDER BY year ) AS Revenue_Previous_Year
revenue - LAG(revenue) OVER ( ORDER BY year ) AS YOY_Difference
FROM   yearly_metrics
```

We calculated the `YOY Difference` by doing an arithmetic difference between revenue in 2020 and revenue in 2019. In the following result, we can see the record for 2019 doesn’t have a `YOY Difference` value because we don’t have data for 2018. Also, the record for 2021 doesn’t have a consistent value in `YOY Difference` because the data for 2021 is not complete.

YearRevenueRevenue Previous YearYOY Difference
20194300000NULLNULL
202048000004300000500000
202118000004800000-3000000

Before going to the next section, I’d like to suggest the article How to Calculate the Difference Between Two Rows in SQL, which goes deeper into the calculation of differences using `LAG()` and `LEAD()`.

Calculating Month-to-Month and Quarter-to-Quarter Differences

In the previous section, we couldn’t calculate a consistent value for the `YOY Difference` for 2021 because we don’t have complete results for 2021. But to see how a business is performing in the current year, we should have some metrics describing its performance. Good key performance indicators can be month-to-month or quarter-to-quarter comparisons (i.e. comparing the first quarter of 2021 to the first quarter of 2020).

Before calculating month-over-month or quarter-over-quarter differences, let’s look at the `daily_metrics` table. It has a record for each day that describes the revenue, costs, and number of new customers for that day.

DayRevenueCostNew Customers
2019-01-01108004650120
2019-01-0210807465080
2020-01-0113720720025
2020-01-0213720720033
2021-01-0112262780010
2021-01-0217388780028

Next, we’ll calculate the `monthly_metrics` CTE ( a CTE is similar to a view, but it is created during query execution). This has a schema similar to the `yearly_metrics` table used previously. The `monthly_metrics` creation SELECT is shown in red; in blue, we can see the SQL query that uses this CTE like a regular table.

```WITH monthly_metrics AS (
SELECT
extract(year from day) as year,
extract(month from day) as month,
SUM(revenue) as revenue
FROM daily_metrics
GROUP BY year, month
)
SELECT
year, month, revenue,
LAG(revenue) OVER (ORDER BY year, month) as Revenue_previous_month,
revenue - LAG(revenue) OVER (ORDER BY year, month) as Month_to_month_difference
FROM monthly_metrics
ORDER BY 1,2;
```

In the above query, we first generate the `monthly_metrics` CTE using a `SELECT` that extracts the year and month from the day column. Then grouping by year and month, we calculate the `SUM` of revenue for each month. The CTE `monthly_metrics` works like a regular table with the columns `year`, `month`, and `revenue`; when the query ends, the CTE is destroyed. You can compute the quarter-to-quarter difference in a similar way.

After that (in blue), we obtain the revenue for the previous month using the `LAG()` window function. Then we calculate the difference between each month and the previous month. Here’s a partial view of the result:

YearMonthRevenueRevenue Previous MonthMonth to Month Difference
20191238568937476910920
20201385805385689116
20202370437385805-15368

There is another way to calculate a month-over-month difference. Instead of comparing against the previous month, we can compare against the same month in the previous year. To make this comparison, we need to use the `LAG()` function’s optional offset parameter, as we can see in the following query:

```
WITH monthly_metrics AS (
SELECT EXTRACT(year from day) as year,
EXTRACT(month from day) as month,
SUM(revenue) as revenue
FROM daily_metrics
GROUP BY 1,2
)
SELECT year AS current_year,
month AS current_month,
revenue AS revenue_current_month,
LAG(year,12) OVER ( ORDER BY year, month) AS previous_year,
LAG(month,12) OVER ( ORDER BY year, month) AS month_comparing_with,
LAG(revenue,12) OVER ( ORDER BY year, month) AS revenue_12_months_ago,
revenue - LAG(revenue,12) OVER (ORDER BY year, month) AS month_to_month_difference
FROM monthly_metrics
ORDER BY 1,2;

```

Above, we used the `LAG()` function with the optional offset parameter, which allows us to obtain a column value from a record N positions before the current record. If we use an offset of 12, we’ll obtain the record for the same month but in the previous year. Below, we can see the result:

Current YearCurrent
Month
Revenue
Current Month
Year Comparing
With
Month Comparing
With
Revenue
12 Months
Ago
Month to Month Difference
202013858052019133662849177
202023704372019230656463873
202033955842019334654349041

And that’s it! Now you know how to find month-over-month, quarter-to-quarter, and year-over-year differences with SQL window functions. I’d really suggest the Window Functions course as a good introduction to working with these functions. If you want to learn more, check out this article describing the Window Functions course.

What’s Next?

Window functions are a key SQL resource. In this article, we used the `LAG()` window function to calculate year-to-year and month-to-month differences. In fact, we can use it to calculate the difference between any time periods – quarters, half years, months, or weeks.

In this article, we focused on differences between time periods, but window functions can be applied to solve many different kinds of data problems. Last but not least, I want to suggest our Window Functions cheat sheet, which I have stuck on a cork board in front of my desk. Invest in you, and grow your SQL skills!