Back to articles list Articles Cookbook
8 minutes read

What Is the MySQL OVER Clause?

If you want to learn window functions in MySQL, you need to understand the OVER clause. In this article, we use real-world examples to explain what the MySQL OVER clause is, how it works, and why it’s so awesome.

In 2018, MySQL introduced a new feature: window functions, which are accessed via the OVER clause. Window functions are a super powerful resource available in almost all SQL databases. They perform a specific calculation (e.g. sum, count, average, etc.) on a set of rows; this set of rows is called a “window” and is defined by the MySQL OVER clause.

In this article, we will explain how to use the MySQL OVER clause in different scenarios. This will also introduce you to several window functions. I hope that, after reading this article, you'll be convinced that it's worthwhile to learn window functions. You’ll also know the basics of how to apply them in your queries!

Window Functions: A Very Requested Feature

Window functions have been available in most major databases for quite some time, but until 2018 they were not available in MySQL. To keep MySQL up to date, window functions were introduced in MySQL 8.02. If you plan to work with MySQL version 8, it’s worth learning window functions and the OVER clause, as they’re very powerful.

When would you use window functions? There are many times they come in useful, e.g. calculating the average salary for a certain group of employees. In this case, the definition of the group is the central point; you don’t want the average of all employee salaries, and if you get the group wrong the result will be wrong. Defining a group of records is the reason for the OVER clause: it dictates where the window function will work.

As we go through this article, we’ll create some query examples based on a database containing records for a group of farmers who produce oranges. The farmers share their production data, which is stored in the orange_production table you see below :

farmer_nameorange_varietycrop_yearnumber_of_treeskilos_producedyear_rainkilo_price
PierreGolden20152400825004001.21
PierreGolden20162400510001801.35
OlekGolden20174000780002501.42
SimonSuperSun20173500750002501.05
PierreGolden20172400625002501.42
OlekGolden20184100690001501.48
SimonSuperSun20183500740001501.07
PierreGolden20182450645002001.43

The simplest use case is an empty OVER clause; this means the window of records is the complete set of records returned by the query. For example, if our farmers want to have a report of every farmer record alongside the total of orange production in 2017, we’d write this query:

SELECT farmer_name, 
       kilos_produced, 
       SUM(kilos_produced) OVER() total_produced
 FROM  orange_production 
WHERE  crop_year = 2017

Here, the OVER clause constructs a window that includes all the records returned by the query – in other words, all the records for year 2017. The result is:

farmer_namekilos_producedtotal_produced
Olek78000215500
Simon75000215500
Pierre62500215500

How a Sliding Window Frame Works

That was a very simple example of the MySQL OVER clause. The window of records was static (the window was the same for all the rows returned by the query). However, one of the strong points of the OVER clause is being able to create a dynamic window of records (also called a sliding window frame). Sliding or dynamic window frames mean the window of records can be different for each row returned by the query. Moreover, the window is created based on the current row in the query, so the rows in the window can change when the current row changes.

Let’s look at an example of a sliding window. Suppose our farmers want to see their own production along with the total production of the same orange variety.

SELECT farmer_name, 
       orange_variety, 
	 crop_year,
       kilos_produced, 
       SUM(kilos_produced) OVER(PARTITION BY orange_variety) AS total_same_variety
FROM orange_production 

The clause OVER(PARTITION BY orange_variety) creates windows by grouping all the records with the same value in the orange_variety column. This gives us two windows: ‘Golden’ and ‘SuperSun’. In the table below, each window is shown in a different color:

farmer_nameorange_varietycrop_yearnumber_of_treeskilos_producedyear_rainkilo_ price
PierreGolden20152400825004001.21
PierreGolden20162400510001801.35
OlekGolden20174000780002501.42
PierreGolden20172400625002501.42
OlekGolden20184100690001501.48
PierreGolden20182450645002001.43
SimonSuperSun20173500750002501.05
SimonSuperSun20183500740001501.07

Now you can see the result of the query:

farmer_nameorange_varietycrop_yearkilos_producedtotal_same_variety
PierreGolden201582500407500
PierreGolden201651000407500
OlekGolden201778000407500
PierreGolden201762500407500
OlekGolden201869000407500
PierreGolden201864500407500
SimonSuperSun201775000149000
SimonSuperSun201874000149000

Notice that the Total Same Variety column (on the far right) includes the production for all years. Perhaps each farmer prefers to compare his production against the total production for the same variety in the same year. This comparison allows them to see their share of the production rate. To do that, we need to add the column crop_year to the PARTITION BY clause. The query will be as follows:

SELECT farmer, 
       orange_variety, 
	 crop_year,
       kilos_produced, 
       SUM(kilos_produced) OVER(PARTITION BY orange_variety, crop_year)                                                                                                                    AS total_same_variety_year
   FROM orange_production 

The clause OVER(PARTITION BY orange_variety, crop_year) creates windows by grouping all records with the same value in the orange_variety and crop_year columns. Below, we’ve again used different colors to show the windows of rows created by this OVER clause:

farmer_nameorange_varietycrop_yearnumber_of_treeskilos_producedyear_rainkilo_price
PierreGolden20152400825004001.21
PierreGolden20162400510001801.35
OlekGolden20174000780002501.42
PierreGolden20172400625002501.42
OlekGolden20184100690001501.48
PierreGolden20182450645002001.43
SimonSuperSun20173500750002501.05
SimonSuperSun20183500740001501.07

And the query results are:

farmer_nameorange_varietycrop_yearkilos_producedtotal_same_variety_year
PierreGolden20158250082500
PierreGolden20165100051000
OlekGolden201778000140500
PierreGolden201762500140500
OlekGolden201869000133500
PierreGolden201864500133500
SimonSuperSun20177500075000
SimonSuperSun20187400074000

If you are interested in going deeper into MySQL window functions, I suggest LearnSQL.com’s Window Functions course, where you can find a complete description of this topic and several examples. If you’ll be using window functions frequently, this cheat sheet – a super-complete quick guide to window functions – is very helpful.

But let’s proceed with our own exploration of OVER and window functions in MySQL first.

Learning the MySQL OVER Clause by Example

In this section, we’ll explore several example queries that showcase different usages of the OVER clause in MySQL.

First up, we’ll use the sub-clause ORDER BY in the OVER clause. ORDER BY will generate a window with the records ordered by a defined criteria. Some functions (like SUM(), LAG(), LEAD(), and NTH_VALUE()) can return different results depending on the order of the rows inside the window. Let’s suppose that Farmer Pierre wants to know his cumulative production over the years:

SELECT farmer, 
       crop_year,
       kilos_produced, 
       SUM(kilos_produced) OVER(ORDER BY crop_year) cumulative_previous_years
 FROM  orange_production
WHERE  farmer = ‘Pierre’

The SUM(kilos_produced) OVER(ORDER BY crop_year) window function works on an ordered window. And it only considers the current and previous rows (i.e. current and previous crop_year values). We can see the result of this cumulative SUM() in the result table:

farmer_namecrop_yearkilos_producedcumulative_previous_years
Pierre20158250082500
Pierre201651000133500
Pierre201762500196000
Pierre201864500260500

In the next example, we will combine two sub-clauses (PARTITION BY and ORDER BY) in the OVER clause. Suppose the farmers want a report showing the total produced by each farmer every year and the total of the previous years. Then we need to partition by the farmer column and order by crop_year:

SELECT farmer, 
       crop_year,
       kilos_produced, 
       SUM(kilos_produced) OVER(PARTITION BY farmer ORDER BY crop_year) cumulative_kilos_produced
   FROM orange_production

The following image shows the windows partitioned by farmer_name in different colors; note that inside each window, the rows are ordered by crop_year.

farmer_nameorange_varietycrop_yearnumber_of_treeskilos_ producedyear_rainkilo_price
PierreGolden20152400825004001.21
PierreGolden20162400510001801.35
PierreGolden20172400625002501.42
PierreGolden20182450645002001.43
SimonSuperSun20173500750002501.05
SimonSuperSun20183500740001501.07
OlekGolden20174000780002501.42
OlekGolden20184100690001501.48

The expression SUM(kilos_produced) OVER(PARTITION BY farmer ORDER BY crop_year) orders rows in the partition (which is based on the farmer value) using crop_year values. You can see this in the results table:

farmer_ namecrop_ yearkilos_ producedcumulative_ previous_years
Olek20177800078000
Olek201869000147000
Pierre20158250082500
Pierre201651000133500
Pierre201762500196000
Pierre201864500260500
Simon20177500075000
Simon201874000149000

For more information on combining the PARTITION BY and ORDER BY clauses, see the article SQL Window Functions by Explanation.

Using Positional Functions on Ordered Windows

So, you can now see the importance of being able to order rows within a window. In fact, some MySQL window functions only work on ordered windows. For example, the LAG() function allows us to obtain a column value from the previous row (related to the current row) in an ordered window.

Let’s say we want a report to show the previous year’s production compared to the current year’s production. To do this, we use the LAG() function on a window ordered by crop_year:

SELECT farmer, 
       crop_year,
       kilos_produced AS current_year_production, 
       LAG(kilos_produced) OVER(PARTITION BY farmer 
ORDER BY crop_year)AS previous_year_production
  FROM orange_production

The LAG() function can return any column value from the row previous to the current row, as the following query result shows:

farmer_namecrop_yearkilos_producedprevious_year_production
Olek201778000NULL
Olek20186900078000
Pierre201582500NULL
Pierre20165100082500
Pierre20176250051000
Pierre20186450062500
Simon201775000NULL
Simon20187400075000

The LEAD() function returns any column value from the row after the current row. Also, the NTH_VALUE() function returns the row in any specified position in an ordered window (e.g. first, second, last). These MySQL window functions help you easily create complex reports.

Let’s try a mathematical operation using MySQL window functions. Suppose the farmers want a report with the production delta related to the previous year. We can use the LAG() function to calculate the difference between the previous and current years’ productions:

SELECT farmer, 
       crop_year,
       kilos_produced current_year_production, 
       kilos_produced - LAG(kilos_produced) OVER(PARTITION BY farmer 
                         ORDER BY crop_year) AS production_delta
   FROM orange_production

The expression kilos_produced - LAG(kilos_produced) OVER(PARTITION BY farmer ORDER BY crop_year) calculates the production delta between the previous and the current years, as we can see in the results:

farmer_namecrop_yearkilos_producedproduction_ delta
Olek201778000NULL
Olek201869000-9000
Pierre201582500NULL
Pierre201651000-31500
Pierre20176250011500
Pierre2018645002000
Simon201775000NULL
Simon201874000-1000

For those readers who want to go deeper into positional functions, I suggest the article Common SQL Window Functions: Positional Functions, which has several examples of this interesting kind of window function.

More About MySQL OVER and Window Functions

In this article, we covered several examples of using the MySQL OVER clause, from very simple to fairly complex usages. Knowing how OVER works and what rows it includes in the window is critical to using window functions.

There are many MySQL window functions you can experiment with: AVG(), MIN(), MAX(), LAG(), LEAD() and NTH_VALUE(). All of them use the OVER clause in the same way we just explained.

Finally, for those readers who want to learn more about MySQL window functions, I recommend the interactive course Window Functions. You can read more about it in our post SQL Course of the Month – Window Functions. You’ll find detailed explanations and lots of examples using different window functions in the course. And remember – when you increase your MySQL skills, you increase your assets!