6th Jul 2021 12 minutes read RANGE Clause in Window Functions: 5 Practical Examples Tihomir Babic sql learn sql Window Functions Table of Contents RANGE Clause Syntax Are the RANGE and the ROW Clauses the Same? Which Data Types Can I Use in the RANGE clause? Example 1 – Calculate the Running Total Example 2 – Calculate the Moving Average Moving Average for Databases That Don't Support Using RANGE with Date/Time Data Types Example 3 – Find the Last Value Within a Range Example 4 – Find the Number of Items Within a Range Example 5 – Find the Maximum Value RANGE Really Does Have Quite a Range of Uses, Doesn't it? What is a RANGE clause in SQL window functions? Five practical examples will show you how and when to use it. The RANGE clause is used quite rarely in SQL window functions. I don't know why; maybe people are not used to it. This is a shame, because it's far from a pointless clause; it can be very useful, and I'll show you that in five examples. To learn how window functions work, what functions there are, and how to apply them to real-world problems, it’s best to take the Window Functions course. It’s interactive, there are 218 exercises, and you only need a web browser and some basic SQL knowledge. RANGE Clause Syntax When I'm talking about the RANGE clause, I'm talking about the one used in SQL window functions, which have the following syntax: OVER ( [ <PARTITION BY clause> ] [ <ORDER BY clause> ] [ <ROW or RANGE clause> ] ) When you look at the syntax above, you see that both ROW or RANGE can be part of the window function. Their syntax is as follows: [<ROWS or RANGE clause> BETWEEN <Start expr> AND <End expr>] The default window frame without the ORDER BY is the whole partition. But when you use the ORDER BY, the default window frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. It's normal to forget this syntax, especially if it's new to you or you don't use it very often. Feel free to have your Window Functions Cheat Sheet open while reading this article. Ready to dive in? Great! First, let's talk about RANGE and ROW. Are the RANGE and the ROW Clauses the Same? Nope, they're not. They do, however, have the same purpose: to specify the starting and ending points within the partition, with the goal of limiting rows. However, each clause does it differently. The ROW clause does it by specifying a fixed number of rows that precede or follow the current row. The RANGE clause, on the other hand, limits the rows logically; it specifies the range of values in relation to the value of the current row. Which Data Types Can I Use in the RANGE clause? You can use two data types with the RANGE clause: numeric and date/time types. You can use the numeric data types and the RANGE clause in almost all popular databases. Unfortunately, Microsoft SQL Server does not support it. Regarding the date/time data types, only a very few popular databases support using them with the RANGE clause. Those are PostgreSQL, MySQL, and Oracle DB. Let's now go to the examples and see how RANGE works in practice! If you're new to SQL window functions, maybe you should check what window functions are and how they compare to aggregate functions first. Come back when you're done reading and we'll continue. Example 1 – Calculate the Running Total The data I'll be working with is in the table revenue. The columns are: id – The date's ID and the table's primary key (PK). date – The revenue's date. revenue_amount – The amount of the revenue. Your task is to calculate running revenue totals using the RANGE clause. Let's do it first without SQL. If you have the following data, what will be the running total (also known as a cumulative sum)? iddaterevenue_amountrunning_total 12021-05-0112,573.2512,573.25 22021-05-0211,348.2223,921.47 32021-05-0214,895.1338,816.60 42021-05-0314,388.1453,204.74 52021-05-0418,847.5472,052.28 The running total for 2021-05-01 equals the revenue for that date: 12,573.25. That's because there are no previous rows to include in the calculation. Then comes 2021-05-02. The running total is today's revenue added to the previous day's revenue: 11,348.22 + 12,573.25 = 23,921.47. Notice there is another row with a different amount of revenue for 2021-05-02. Maybe this is for another branch, country, product or whatever. It works the same way: 14,895.13 + 23,921.47 = 38,816.60. (The RANGE clause will work even if there are multiple rows with the same date.) Next comes 2021-05-03. The running total for this date will be 14,388.14 + 38,816.60 = 53,204.74. Finally, the running total for 2021-05-04 will be 18,847.54 + 53,204 = 72,052.28. How do you do the same using the RANGE clause? It could be done like this: SELECT id, date, revenue_amount, SUM(revenue_amount) OVER ( ORDER BY date RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS running_total FROM revenue; I’ve used the SUM() function on the column revenue_amount; this is the operation required to get the running total. For the SUM() function to become a window function, you need the OVER() clause. The window function calculation is done in ascending order; that’s because I want to make sure the revenue is being summed from the oldest to the newest date. Then comes the RANGE clause. It limits the window to the dates preceding the current date (BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) and the current date. Those are the rows that will be included in the running total calculation. Here’s what you get when you run the code: iddaterevenue_amountrunning_total 12021-05-0112,573.2512,573.25 22021-05-0211,348.2238,816.60 32021-05-0214,895.1338,816.60 42021-05-0314,388.1453,204.74 52021-05-0418,847.5472,052.28 62021-05-059,845.2981,897.57 72021-05-0614,574.5696,472.13 82021-05-0711,500.63107,972.76 92021-05-0816,897.21124,869.97 102021-05-099,634.56134,504.53 112021-05-1014,255.87148,760.40 122021-05-1111,248.33160,008.73 132021-05-1221,489.22181,497.95 142021-05-1314,448.65195,946.60 152021-05-1415,517.22211,463.82 162021-05-156,874.23218,338.05 172021-05-1612,500.00230,838.05 182021-05-179,784.33240,622.38 192021-05-1815,321.89255,944.27 202021-05-1912,235.50268,179.77 212021-05-2022,222.22290,401.99 212021-05-2114,800.65305,202.64 222021-05-225,894.12311,096.76 232021-05-2318,845.69329,942.45 242021-05-249,966.66339,909.11 252021-05-2513,250.69353,159.80 262021-05-264,987.56358,147.36 272021-05-2717,784.25375,931.61 282021-05-2812,567.45388,499.06 292021-05-2919,874.26408,373.32 302021-05-3015,489.36423,862.68 312021-05-3114,987.55438,850.23 Notice that where there are multiple values for one date (2021-05-02) the code will include both rows in the running total calculation for that date. That's why there's 38,816.60 in the running_total column for that date. Example 2 – Calculate the Moving Average Let's now see how you use the RANGE clause with date/time data types – and without the id column to help! We'll use a slightly modified table, which is now called revenue_per_shop. It contains the revenue data for two shops. The columns are: date – The revenue's date. shop – The shop's name. revenue_amount – The amount of revenue for that shop on that date. First of all, let's make sure you understand what a moving average is. A two-day moving average includes the current day and the previous day. Here's some example data to show you how a moving average works: daterevenue_amountmoving_avg 2021-05-0112,573.2512,573.25 2021-05-0211,348.2211,960.74 2021-05-0314,388.1412,868.18 2021-05-0418,847.5416,617.84 The two-day moving average for 2021-05-01 is the daily revenue itself: 12,573.25. That’s because there are no other rows to include in the calculation. The 2021-05-02 calculation includes two dates: (12,573.25 + 11,348.22)/2 = 11,960.74. The other rows follow the same two-date logic – the current date and the previous date. So how do you calculate the same metric for every shop separately? Like this: SELECT shop, date, revenue_amount, AVG(revenue_amount) OVER ( PARTITION BY shop ORDER BY date ASC RANGE BETWEEN INTERVAL '1' DAY PRECEDING AND CURRENT ROW ) AS moving_avg FROM revenue_per_shop; The code first selects some columns from the table. Then comes the fun part. I’m using the AVG() function on the column revenue_amount because I want the average revenue. Again, this is a window function, so it must have the OVER() clause. I use the PARTITION BY to specify the column on which I want to aggregate data; it’s the column shop because I want the moving average for every shop separately. The operation is again ordered by date. In the RANGE clause, I merely specify which rows to include in the calculation. Since I’m working with dates, I’ll get the previous date by stating: BETWEEN INTERVAL '1' DAY PRECEDING. Working with date/time data differs between databases. You may need to write it this way in some databases: RANGE BETWEEN 1 DAY PRECEDING AND CURRENT ROW. The result should, nevertheless, be the same: shopdaterevenue_amountmoving_avg Shop 12021-05-0112,573.2512,573.25 Shop 12021-05-0214,388.1413,480.70 Shop 12021-05-039,845.2912,116.72 Shop 12021-05-0411,500.6310,672.96 Shop 12021-05-059,634.5610,567.60 Shop 12021-05-0611,248.3310,441.45 Shop 12021-05-0714,448.6512,848.49 Shop 12021-05-086,874.2310,661.44 Shop 12021-05-099,784.338,329.28 Shop 12021-05-1012,235.5011,009.92 Shop 12021-05-1114,800.6513,518.08 Shop 12021-05-1218,845.6916,823.17 Shop 12021-05-1313,250.6916,048.19 Shop 12021-05-1417,784.2515,517.47 Shop 12021-05-1519,874.2618,829.26 Shop 22021-05-0111,348.2211,348.22 Shop 22021-05-0218,847.5415,097.88 Shop 22021-05-0314,574.5616,711.05 Shop 22021-05-0416,897.2115,735.89 Shop 22021-05-0514,255.8715,576.54 Shop 22021-05-0621,489.2217,872.55 Shop 22021-05-0715,517.2218,503.22 Shop 22021-05-0812,500.0014,008.61 Shop 22021-05-0915,321.8913,910.95 Shop 22021-05-1022,222.2218,772.06 Shop 22021-05-115,894.1214,058.17 Shop 22021-05-129,966.667,930.39 Shop 22021-05-134,987.567,477.11 Shop 22021-05-1412,567.458,777.51 Shop 22021-05-1515,489.3614,028.41 Moving Average for Databases That Don't Support Using RANGE with Date/Time Data Types What should you do if your database doesn't support using RANGE with date/time data types? There's a way to “trick” your database; in fact, there are probably several ways. Here's one way that calculates the difference between every date and 2021-05-01 (i.e. the first date in the data). You get the difference as an integer, which you can use instead of the date. This is the code: SELECT shop, date, revenue_amount, date - '2021_05_01' AS day_difference, AVG(revenue_amount) OVER ( PARTITION BY shop ORDER BY (date - '2021_05_01') RANGE BETWEEN 1 PRECEDING AND CURRENT ROW ) AS moving_avg FROM revenue_per_shop; I’ve intentionally included the column day_difference in the result so that you understand the logic. I’ve used the same difference in the ORDER BY clause so that I can use an integer with the RANGE clause. And the moving average is the same as in the example above; see for yourself. (I’m showing only the first five rows to save space.) shopdaterevenue_amountday_differencemoving_avg Shop 12021-05-0112,573.25012,573.25 Shop 12021-05-0214,388.14113,480.70 Shop 12021-05-039,845.29212,116.72 Shop 12021-05-0411,500.63310,672.96 Shop 12021-05-059,634.56410,567.60 Example 3 – Find the Last Value Within a Range This use of the RANGE clause allows you to find the last value within a defined range. For example, using the table revenue_by_shop, I can get the last value for every shop separately. The last value, in this case, means the last data available, which is the revenue for 2021-05-15. How do you get that data? By using the RANGE clause, of course: SELECT shop, date, revenue_amount, LAST_VALUE(revenue_amount) OVER ( PARTITION BY shop ORDER BY date RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS last_value FROM revenue_per_shop; The window function I’ve used this time is LAST_VALUE(). Once again, I’m using it on the column revenue_amount. I’ve partitioned the data by shop, same as before. And I ordered it by date, again the same as before. To get the last value, I used RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. Remember, the default range with the ORDER BY clause is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. If you don’t change it, you’ll get the wrong result. The right result is: shopdaterevenue_amountlast_value Shop 12021-05-0112,573.2519,874.26 Shop 12021-05-0214,388.1419,874.26 Shop 12021-05-039,845.2919,874.26 Shop 12021-05-0411,500.6319,874.26 Shop 12021-05-059,634.5619,874.26 Shop 12021-05-0611,248.3319,874.26 Shop 12021-05-0714,448.6519,874.26 Shop 12021-05-086,874.2319,874.26 Shop 12021-05-099,784.3319,874.26 Shop 12021-05-1012,235.5019,874.26 Shop 12021-05-1114,800.6519,874.26 Shop 12021-05-1218,845.6919,874.26 Shop 12021-05-1313,250.6919,874.26 Shop 12021-05-1417,784.2519,874.26 Shop 12021-05-1519,874.2619,874.26 Shop 22021-05-0111,348.2215,489.36 Shop 22021-05-0218,847.5415,489.36 Shop 22021-05-0314,574.5615,489.36 Shop 22021-05-0416,897.2115,489.36 Shop 22021-05-0514,255.8715,489.36 Shop 22021-05-0621,489.2215,489.36 Shop 22021-05-0715,517.2215,489.36 Shop 22021-05-0812,500.0015,489.36 Shop 22021-05-0915,321.8915,489.36 Shop 22021-05-1022,222.2215,489.36 Shop 22021-05-115,894.1215,489.36 Shop 22021-05-129,966.6615,489.36 Shop 22021-05-134,987.5615,489.36 Shop 22021-05-1412,567.4515,489.36 Shop 22021-05-1515,489.3615,489.36 Example 4 – Find the Number of Items Within a Range Here's another fun and useful way to use the RANGE clause. How would you find the number of times the daily revenue for any shop was between 1,000 (dollars, euros…) below and above its current value? This code might help: SELECT shop, date, revenue_amount, COUNT(*) OVER ( ORDER BY revenue_amount ASC RANGE BETWEEN 1000 PRECEDING AND 1000 FOLLOWING ) AS number_of_days FROM revenue_per_shop; I’m using the COUNT() window function. Since I’m not interested in separating the revenue by shops, there’s no PARTITION BY. The counting will be performed in ascending order according to the revenue amount. The range is defined by RANGE BETWEEN 1000 PRECEDING AND 1000 FOLLOWING. Here’s what the code will return: shopdaterevenue_amountnumber_of_days Shop 22021-05-134,987.562 Shop 22021-05-115,894.123 Shop 12021-05-086,874.232 Shop 12021-05-059,634.564 Shop 12021-05-099,784.334 Shop 12021-05-039,845.294 Shop 22021-05-129,966.664 Shop 12021-05-0611,248.334 Shop 22021-05-0111,348.224 Shop 12021-05-0411,500.635 Shop 12021-05-1012,235.507 Shop 22021-05-0812,500.006 Shop 22021-05-1412,567.455 Shop 12021-05-0112,573.255 Shop 12021-05-1313,250.694 Shop 22021-05-0514,255.875 Shop 12021-05-0214,388.146 Shop 12021-05-0714,448.656 Shop 22021-05-0314,574.568 Shop 12021-05-1114,800.658 Shop 22021-05-0915,321.897 Shop 22021-05-1515,489.365 Shop 22021-05-0715,517.225 Shop 22021-05-0416,897.212 Shop 12021-05-1417,784.252 Shop 12021-05-1218,845.692 Shop 22021-05-0218,847.542 Shop 12021-05-1519,874.261 Shop 22021-05-0621,489.222 Shop 22021-05-1022,222.222 Let me explain what this result tells you. If you take the first row, the result in the column number_of_days is 2. There are two instances where the revenue is between 3,987.56 and 5,987.56. Why this range? The revenue for 2021-05-13 is 4,987.56. So 4,987.56 - 1,000 = 3,987.56 and 4,987.56 + 1,000 = 5,987.56. Do you want to check the result? Which two instances are between this range? Obviously, the first two: shopdaterevenue_amountnumber_of_days Shop 22021-05-134,987.562 Shop 22021-05-115,894.123 Do you want to check the second line? It says there are three instances between 4,894.12 and 6,894.12 – these three: shopdaterevenue_amountnumber_of_days Shop 22021-05-134,987.562 Shop 22021-05-115,894.123 Shop 12021-05-086,874.232 The same logic applies to the rest of the data. Example 5 – Find the Maximum Value This is not as simple as it may seem; I'm not talking about the common maximum value. To find that, you wouldn't need the RANGE clause. But how about finding the maximum value (or revenue, in this case) across five days? Those five days will include the current date, up to three days before that, and one day after the current date. You probably already know the logic after all these examples of RANGE usage. Here's my solution: SELECT shop, date, revenue_amount, MAX(revenue_amount) OVER ( ORDER BY DATE RANGE BETWEEN INTERVAL '3' DAY PRECEDING AND INTERVAL '1' DAY FOLLOWING ) AS max_revenue FROM revenue_per_shop; I’m using the MAX() function as a window function. Yet again, I’m using it with the column revenue_amount. There is no PARTITION BY in the OVER() clause because I’m not interested in separating data on any level. Defining the range is not that difficult: RANGE BETWEEN INTERVAL '3' DAY PRECEDING AND INTERVAL '1' DAY FOLLOWING. This will include the current date, three days before it, and one day after it. Here’s the result: shopdaterevenue_amountmax_revenue Shop 12021-05-0112,573.2518,847.54 Shop 22021-05-0111,348.2218,847.54 Shop 12021-05-0214,388.1418,847.54 Shop 22021-05-0218,847.5418,847.54 Shop 12021-05-039,845.29 18,847.54 Shop 22021-05-0314,574.5618,847.54 Shop 12021-05-0411,500.6318,847.54 Shop 22021-05-0416,897.2118,847.54 Shop 12021-05-059,634.56 21,489.22 Shop 22021-05-0514,255.8721,489.22 Shop 12021-05-0611,248.3321,489.22 Shop 22021-05-0621,489.2221,489.22 Shop 22021-05-0715,517.2221,489.22 Shop 12021-05-0714,448.6521,489.22 Shop 22021-05-0812,500.0021,489.22 Shop 12021-05-086,874.23 21,489.22 Shop 22021-05-0915,321.8922,222.22 Shop 12021-05-099,784.33 22,222.22 Shop 12021-05-1012,235.5022,222.22 Shop 22021-05-1022,222.2222,222.22 Shop 12021-05-1114,800.6522,222.22 Shop 22021-05-115,894.12 22,222.22 Shop 22021-05-129,966.66 22,222.22 Shop 12021-05-1218,845.6922,222.22 Shop 12021-05-1313,250.6922,222.22 Shop 22021-05-134,987.56 22,222.22 Shop 12021-05-1417,784.2519,874.26 Shop 22021-05-1412,567.4519,874.26 Shop 22021-05-1515,489.3619,874.26 Shop 12021-05-1519,874.2619,874.26 Let's check the result for 2021-05-05 – marked in pink. The range is marked in yellow. To get the maximum revenue in that range, SQL will compare values: 14,388.14, 18,847.54, 9,845.29, 14,574.56, 11,500.63, 16,897.21, 9,634.56, 14,255.87, 11,248.33, 21,489.22. Which one is the highest? It's 21,489.22. After learning how to find maximum value using the RANGE clause, I've reached the maximum number of examples intended for this article. If you want more window functions examples, you can always read this article. RANGE Really Does Have Quite a Range of Uses, Doesn't it? I think these five examples show you a good range of the RANGE clause's possibilities. It's not all you can do with it. Its use depends on the data you have and probably a little bit of imagination. This clause is not talked about very often, which is a shame. I advise everybody to learn it. It could save you time spent creating workaround solutions. The RANGE clause is elegant and really not very complicated. Where can you learn about RANGE? In our Window Functions course, of course. You'll not only learn the RANGE clause, but every aspect of the window functions. Or you can use the course to practice your SQL skills. If you want to find out more, this article will tell you everything about the course and what it offers. Tags: sql learn sql Window Functions