24th Sep 2020 8 minutes read What Is the MySQL OVER Clause? Ignacio L. Bisso window functions MySQL Table of Contents Window Functions: A Very Requested Feature How a Sliding Window Frame Works Learning the MySQL OVER Clause by Example Using Positional Functions on Ordered Windows More About MySQL OVER and Window Functions If you want to learn window functions in MySQL, you need to understand the OVER clause. In this article, we use real-world examples to explain what the MySQL OVER clause is, how it works, and why it’s so awesome. In 2018, MySQL introduced a new feature: window functions, which are accessed via the OVER clause. Window functions are a super powerful resource available in almost all SQL databases. They perform a specific calculation (e.g. sum, count, average, etc.) on a set of rows; this set of rows is called a “window” and is defined by the MySQL OVER clause. In this article, we will explain how to use the MySQL OVER clause in different scenarios. This will also introduce you to several window functions. I hope that, after reading this article, you'll be convinced that it's worthwhile to learn window functions. You’ll also know the basics of how to apply them in your queries! Window Functions: A Very Requested Feature Window functions have been available in most major databases for quite some time, but until 2018 they were not available in MySQL. To keep MySQL up to date, window functions were introduced in MySQL 8.02. If you plan to work with MySQL version 8, it’s worth learning window functions and the OVER clause, as they’re very powerful. When would you use window functions? There are many times they come in useful, e.g. calculating the average salary for a certain group of employees. In this case, the definition of the group is the central point; you don’t want the average of all employee salaries, and if you get the group wrong the result will be wrong. Defining a group of records is the reason for the OVER clause: it dictates where the window function will work. As we go through this article, we’ll create some query examples based on a database containing records for a group of farmers who produce oranges. The farmers share their production data, which is stored in the orange_production table you see below : farmer_nameorange_varietycrop_yearnumber_of_treeskilos_producedyear_rainkilo_price PierreGolden20152400825004001.21 PierreGolden20162400510001801.35 OlekGolden20174000780002501.42 SimonSuperSun20173500750002501.05 PierreGolden20172400625002501.42 OlekGolden20184100690001501.48 SimonSuperSun20183500740001501.07 PierreGolden20182450645002001.43 The simplest use case is an empty OVER clause; this means the window of records is the complete set of records returned by the query. For example, if our farmers want to have a report of every farmer record alongside the total of orange production in 2017, we’d write this query: SELECT farmer_name, kilos_produced, SUM(kilos_produced) OVER() total_produced FROM orange_production WHERE crop_year = 2017 Here, the OVER clause constructs a window that includes all the records returned by the query – in other words, all the records for year 2017. The result is: farmer_namekilos_producedtotal_produced Olek78000215500 Simon75000215500 Pierre62500215500 How a Sliding Window Frame Works That was a very simple example of the MySQL OVER clause. The window of records was static (the window was the same for all the rows returned by the query). However, one of the strong points of the OVER clause is being able to create a dynamic window of records (also called a sliding window frame). Sliding or dynamic window frames mean the window of records can be different for each row returned by the query. Moreover, the window is created based on the current row in the query, so the rows in the window can change when the current row changes. Let’s look at an example of a sliding window. Suppose our farmers want to see their own production along with the total production of the same orange variety. SELECT farmer_name, orange_variety, crop_year, kilos_produced, SUM(kilos_produced) OVER(PARTITION BY orange_variety) AS total_same_variety FROM orange_production The clause OVER(PARTITION BY orange_variety) creates windows by grouping all the records with the same value in the orange_variety column. This gives us two windows: ‘Golden’ and ‘SuperSun’. In the table below, each window is shown in a different color: farmer_nameorange_varietycrop_yearnumber_of_treeskilos_producedyear_rainkilo_ price PierreGolden20152400825004001.21 PierreGolden20162400510001801.35 OlekGolden20174000780002501.42 PierreGolden20172400625002501.42 OlekGolden20184100690001501.48 PierreGolden20182450645002001.43 SimonSuperSun20173500750002501.05 SimonSuperSun20183500740001501.07 Now you can see the result of the query: farmer_nameorange_varietycrop_yearkilos_producedtotal_same_variety PierreGolden201582500407500 PierreGolden201651000407500 OlekGolden201778000407500 PierreGolden201762500407500 OlekGolden201869000407500 PierreGolden201864500407500 SimonSuperSun201775000149000 SimonSuperSun201874000149000 Notice that the Total Same Variety column (on the far right) includes the production for all years. Perhaps each farmer prefers to compare his production against the total production for the same variety in the same year. This comparison allows them to see their share of the production rate. To do that, we need to add the column crop_year to the PARTITION BY clause. The query will be as follows: SELECT farmer, orange_variety, crop_year, kilos_produced, SUM(kilos_produced) OVER(PARTITION BY orange_variety, crop_year) AS total_same_variety_year FROM orange_production The clause OVER(PARTITION BY orange_variety, crop_year) creates windows by grouping all records with the same value in the orange_variety and crop_year columns. Below, we’ve again used different colors to show the windows of rows created by this OVER clause: farmer_nameorange_varietycrop_yearnumber_of_treeskilos_producedyear_rainkilo_price PierreGolden20152400825004001.21 PierreGolden20162400510001801.35 OlekGolden20174000780002501.42 PierreGolden20172400625002501.42 OlekGolden20184100690001501.48 PierreGolden20182450645002001.43 SimonSuperSun20173500750002501.05 SimonSuperSun20183500740001501.07 And the query results are: farmer_nameorange_varietycrop_yearkilos_producedtotal_same_variety_year PierreGolden20158250082500 PierreGolden20165100051000 OlekGolden201778000140500 PierreGolden201762500140500 OlekGolden201869000133500 PierreGolden201864500133500 SimonSuperSun20177500075000 SimonSuperSun20187400074000 If you are interested in going deeper into MySQL window functions, I suggest LearnSQL.com’s Window Functions course, where you can find a complete description of this topic and several examples. If you’ll be using window functions frequently, this cheat sheet – a super-complete quick guide to window functions – is very helpful. But let’s proceed with our own exploration of OVER and window functions in MySQL first. Learning the MySQL OVER Clause by Example In this section, we’ll explore several example queries that showcase different usages of the OVER clause in MySQL. First up, we’ll use the sub-clause ORDER BY in the OVER clause. ORDER BY will generate a window with the records ordered by a defined criteria. Some functions (like SUM(), LAG(), LEAD(), and NTH_VALUE()) can return different results depending on the order of the rows inside the window. Let’s suppose that Farmer Pierre wants to know his cumulative production over the years: SELECT farmer, crop_year, kilos_produced, SUM(kilos_produced) OVER(ORDER BY crop_year) cumulative_previous_years FROM orange_production WHERE farmer = ‘Pierre’ The SUM(kilos_produced) OVER(ORDER BY crop_year) window function works on an ordered window. And it only considers the current and previous rows (i.e. current and previous crop_year values). We can see the result of this cumulative SUM() in the result table: farmer_namecrop_yearkilos_producedcumulative_previous_years Pierre20158250082500 Pierre201651000133500 Pierre201762500196000 Pierre201864500260500 In the next example, we will combine two sub-clauses (PARTITION BY and ORDER BY) in the OVER clause. Suppose the farmers want a report showing the total produced by each farmer every year and the total of the previous years. Then we need to partition by the farmer column and order by crop_year: SELECT farmer, crop_year, kilos_produced, SUM(kilos_produced) OVER(PARTITION BY farmer ORDER BY crop_year) cumulative_kilos_produced FROM orange_production The following image shows the windows partitioned by farmer_name in different colors; note that inside each window, the rows are ordered by crop_year. farmer_nameorange_varietycrop_yearnumber_of_treeskilos_ producedyear_rainkilo_price PierreGolden20152400825004001.21 PierreGolden20162400510001801.35 PierreGolden20172400625002501.42 PierreGolden20182450645002001.43 SimonSuperSun20173500750002501.05 SimonSuperSun20183500740001501.07 OlekGolden20174000780002501.42 OlekGolden20184100690001501.48 The expression SUM(kilos_produced) OVER(PARTITION BY farmer ORDER BY crop_year) orders rows in the partition (which is based on the farmer value) using crop_year values. You can see this in the results table: farmer_ namecrop_ yearkilos_ producedcumulative_ previous_years Olek20177800078000 Olek201869000147000 Pierre20158250082500 Pierre201651000133500 Pierre201762500196000 Pierre201864500260500 Simon20177500075000 Simon201874000149000 For more information on combining the PARTITION BY and ORDER BY clauses, see the article SQL Window Functions by Explanation. Using Positional Functions on Ordered Windows So, you can now see the importance of being able to order rows within a window. In fact, some MySQL window functions only work on ordered windows. For example, the LAG() function allows us to obtain a column value from the previous row (related to the current row) in an ordered window. Let’s say we want a report to show the previous year’s production compared to the current year’s production. To do this, we use the LAG() function on a window ordered by crop_year: SELECT farmer, crop_year, kilos_produced AS current_year_production, LAG(kilos_produced) OVER(PARTITION BY farmer ORDER BY crop_year)AS previous_year_production FROM orange_production The LAG() function can return any column value from the row previous to the current row, as the following query result shows: farmer_namecrop_yearkilos_producedprevious_year_production Olek201778000NULL Olek20186900078000 Pierre201582500NULL Pierre20165100082500 Pierre20176250051000 Pierre20186450062500 Simon201775000NULL Simon20187400075000 The LEAD() function returns any column value from the row after the current row. Also, the NTH_VALUE() function returns the row in any specified position in an ordered window (e.g. first, second, last). These MySQL window functions help you easily create complex reports. Let’s try a mathematical operation using MySQL window functions. Suppose the farmers want a report with the production delta related to the previous year. We can use the LAG() function to calculate the difference between the previous and current years’ productions: SELECT farmer, crop_year, kilos_produced current_year_production, kilos_produced - LAG(kilos_produced) OVER(PARTITION BY farmer ORDER BY crop_year) AS production_delta FROM orange_production The expression kilos_produced - LAG(kilos_produced) OVER(PARTITION BY farmer ORDER BY crop_year) calculates the production delta between the previous and the current years, as we can see in the results: farmer_namecrop_yearkilos_producedproduction_ delta Olek201778000NULL Olek201869000-9000 Pierre201582500NULL Pierre201651000-31500 Pierre20176250011500 Pierre2018645002000 Simon201775000NULL Simon201874000-1000 For those readers who want to go deeper into positional functions, I suggest the article Common SQL Window Functions: Positional Functions, which has several examples of this interesting kind of window function. More About MySQL OVER and Window Functions In this article, we covered several examples of using the MySQL OVER clause, from very simple to fairly complex usages. Knowing how OVER works and what rows it includes in the window is critical to using window functions. There are many MySQL window functions you can experiment with: AVG(), MIN(), MAX(), LAG(), LEAD() and NTH_VALUE(). All of them use the OVER clause in the same way we just explained. Finally, for those readers who want to learn more about MySQL window functions, I recommend the interactive course Window Functions. You can read more about it in our post SQL Course of the Month – Window Functions. You’ll find detailed explanations and lots of examples using different window functions in the course. And remember – when you increase your MySQL skills, you increase your assets! Tags: window functions MySQL