Back to articles list Articles Cookbook
22 minutes read

What Are SQL Window Functions?

Need to up your data analysis game? Learn these SQL window functions here and you’ll take your analysis skills to the next level. Practice exercises and detailed explanations included!

When I first heard about SQL window functions, I thought it was some strange marriage between SQL and Windows. I was wrong. The SQL window functions have nothing to do with that famous operating system. They are SQL functions that do computations on the set of rows related to the current row. This set of rows is called a window or window frame – hence the function name.

You might also hear about windowing functions in SQL, analytical functions, or the OVER() functions. These are all just alternative names for SQL window functions – an extremely helpful set of tools for data analysis.

In this article, we’ll show you what you can do with window functions and how. I’ll start with a short history of SQL window functions and explain why they are called that. Then, I’ll guide you through the syntax and show you how it works with several examples. After practicing syntax, we’ll be ready for real-life window function examples from a data analyst’s life. And here’s the juiciest part: after each example, there’s an exercise for you to solve and learn through coding.

However, the main source of your knowledge on this subject should be our Window Functions course. Its 218 interactive exercises cover SQL window functions in detail. In other words, you’ll learn about window frames and the OVER(), PARTITION BY, and ORDER BY clauses. All this is necessary to aggregate, rank, and analyze data using window functions. 

A Brief History of SQL Window Functions

Window functions were first introduced in the Oracle8i Database, which was released in 1998. However, they were included in the SQL standard five years later with SQL:2003.

Then Microsoft included them in SQL Server 2005. Other database management systems (DBMS) followed; PostgreSQL has supported them since PostgreSQL 8.4 was released in 2009; MariaDB included them with the 10.2 version (2016), and MySQL added them to version 8 in 2018.

Window functions are a rather new feature in SQL. Because of that, they are not part of the usual SQL curriculum. By learning them, you’ll be ahead of the curve compared to many SQL users.

What’s a Window?

A set of rows related to the current row is called a window or a window frame. Hence, the name of these functions: their result is based on a sliding window frame.

For example, you can calculate a cumulative sum as shown below:

datesalescumulative_sum
2023-10-014,2414,241
2023-10-022,3896,630
2023-10-031,5808,210
2023-10-043,39511,605
2023-10-051,26512,870

The window for the 2023-10-04 cumulative sum is highlighted in green. It includes the current row (for 2023-10-04) and all the previous rows. So the cumulative sum is calculated as the sum of all the previous and current sales: 4,241 + 2,389 + 1,580 + 3,395 = 11,605. (Note that the row outlined in red dots is not included in the window or the sum.)

When we move to the next row, the window also moves: it will now include all the previous rows (green) and the current row (red dotted). Now the cumulative sum is 4,241 + 2,389 + 1,580 + 3,395 + 1,265 = 12,870.

So the window is the set of rows related to the current row that are used in computations for this row. The window changes (slides) as we move across the rows; thanks to these images of a sliding window, we get the name of these functions.

SQL Window Function Syntax

The syntax for window functions is:

SELECT column_1,
       column_2,
	 <window_function> OVER(PARTITION BY … ORDER BY … ) AS column_alias
FROM table;

Here’s what each part does:

  • <window_function> – Specifies the function to apply to that window.
  • OVER() – Defines the window (set of rows) and indicates that this is a window function; without this clause, it’s not a window function.
  • <window_frame> – Defines the window frame size (optional).
  • PARTITION BY – Divides the window into smaller groups called partitions (optional); if omitted, the whole result set is one partition.
  • ORDER BY – Sorts rows within the window frame (optional), i.e., decides in which order the window operation will be performed; if omitted, the order of rows within the partition is arbitrary.

Additional clauses can further define the window. Their syntax is:

[<ROWS or RANGE clause> BETWEEN <lower_bound> AND <upper_bound>]

The ROWS clause defines the window in terms of the fixed number of rows in relation to the current row.

The RANGE clause does the same. But it also takes into calculation all the rows with the same values in the columns specified in the ORDER BY clause as the current row.

The window bounds can be defined as

  • UNBOUNDED PRECEDING – All the rows before the current row.
  • n PRECEDING – A defined number of rows before the current row.
  • CURRENT ROW – Includes the current row.
  • n FOLLOWING – A defined number of rows after the current row.
  • UNBOUNDED FOLLOWING – All the rows after the current row.

Let’s now see how this works in practice.

Dataset and Syntax Examples

We’ll use the table album_catalogue in all these examples. You can create it yourself using this script. A data snapshot is shown below:

idalbum_titlealbum_lengthalbum_genreartistcopies_soldsales_period
1Wednesday Morning, 3 A.M0:31:38FolkSimon & Garfunkel10432022_1Q
2EnRoute: John Scofield Trio LIVE1:13:48JazzJohn Scofield Trio5122022_1Q
3Nasty Gal0:39:15FunkBetty Davis8092022_1Q
4The New Folk Sound of Terry Callier0:37:41FolkTerry Callier9032022_1Q
5In a Silent Way0:38:08JazzMiles Davis4282022_1Q

The dataset is a list of albums with their length, genre, artist, and sales data, including the number of copies sold and the period (quarters). The data goes all the way to the third quarter of 2023.

I’ll first show you several examples, explaining each crucial part of the windowing functions’ syntax along the way.

Syntax Example #1: OVER ()

You can use the SUM() window function with only the OVER() clause to get the total sales in the fourth quarter of 2022:

SELECT sales_period,
 album_title,
	 artist,
	 copies_sold,
	 SUM (copies_sold) OVER() AS sold_in_4Q_2022
FROM album_catalogue
WHERE sales_period = '2022_4Q';

I want the sum of the copies sold, so I specify this column in SUM(). The OVER() clause is mandatory. If you want to use OVER() without any of the optional clauses, just leave the parentheses empty.

I use WHERE to output only data from the desired quarter.

When you write a query like this – with an empty OVER() – the whole result set (selected columns, applied filters, etc.) is taken into account when performing the window function calculations. Here, the result shows the individual sales of every album sold in the fourth quarter of 2022. It also shows the total sales of all albums sold in that period.

sales_periodalbum_titleartistcopies_soldsold_in_4q_2022
2022_4QWednesday Morning, 3 A.MSimon & Garfunkel8097,403
2022_4QEnRoute: John Scofield Trio LIVEJohn Scofield Trio6127,403
2022_4QNasty GalBetty Davis3697,403
2022_4QThe New Folk Sound of Terry CallierTerry Callier2147,403
2022_4QIn a Silent WayMiles Davis657,403
2022_4QCold SweatJames Brown2097,403
2022_4QThe Freewheelin' Bob DylanBob Dylan2467,403
2022_4QMy Favorite ThingsJohn Coltrane3777,403
2022_4QA Whole New ThingSly and the Family Stone8167,403
2022_4QFive Leaves LeftNick Drake4007,403
2022_4QHead HuntersHerbie Hancock4097,403
2022_4QIn the Right PlaceDr. John9127,403
2022_4QBlueJoni Mitchell4127,403
2022_4QConciertoJim Hall6127,403
2022_4QDirty MindPrince9417,403

With the help of SUM() and OVER(), I’m able to show each individual album’s sales and quarterly total.

Syntax Example #2: OVER (ORDER BY)

You can add additional clauses inside the OVER() clause to change the definition of the window frame. One such clause is ORDER BY. The ORDER BY clause defines the sorting of rows within a window frame: the rows can be processed by the window function in a given order.

Let’s see an example. You can calculate the cumulative sum by adding ORDER BY to the previous query. In this example, I want to see how the album ‘In the Right Place’ sells over time and the cumulative number of albums sold up to a given period of time. Here’s the query:

SELECT sales_period,
	 album_title,
	 artist,
	 copies_sold,
	 SUM (copies_sold) OVER(ORDER BY sales_period ASC) AS cumulative_sum
FROM album_catalogue
WHERE album_title = 'In the Right Place';

The same column copies_sold is in SUM(). This time, OVER() contains the ORDER BY clause. You want to show the cumulative sales from the earliest to the latest quarter. That’s why you need sales_period and ASC in ORDER BY.

sales_periodalbum_titleartistcopies_soldcumulative_sum
2022_1QIn the Right PlaceDr. John222222
2022_2QIn the Right PlaceDr. John208430
2022_3QIn the Right PlaceDr. John94524
2022_4QIn the Right PlaceDr. John9121436
2023_1QIn the Right PlaceDr. John9122348
2023_2QIn the Right PlaceDr. John562404
2023_3QIn the Right PlaceDr. John5622966

In each row, you can see the sales for each quarter and the cumulative sum, i.e., the sum of the current and all the previous quarters. For instance, the album sold 94 copies in the third quarter of 2022. The total sales in 2022 up until then (or in three quarters) is: 222 + 208 + 94 = 524.

Syntax Example #3: OVER (PARTITION BY)

Another clause you can use in OVER() is PARTITION BY. PARTITION BY is used to divide the window into smaller segments based on some criteria. For example, you can list the albums, their sales data for the fourth quarter of 2022, and the sales by genre for that quarter:

SELECT album_title,
	 artist,
	 copies_sold,
	 album_genre,
	 SUM (copies_sold) OVER(PARTITION BY album_genre) AS sales_by_genre
FROM album_catalogue
WHERE sales_period = '2022_4Q';

Again, we use the same window function SUM(). This time, though, we use PARTITION BY to divide the window into smaller segments based on album genre. Everything else stays the same.

The query returns the result below. It’s an analysis of album sales data by genre for the last quarter of 2022.

album_titleartistcopies_soldalbum_genresales_by_genre
Wednesday Morning, 3 A.MSimon & Garfunkel809Folk2,081
The Freewheelin' Bob DylanBob Dylan246Folk2,081
Five Leaves LeftNick Drake400Folk2,081
The New Folk Sound of Terry CallierTerry Callier214Folk2,081
BlueJoni Mitchell412Folk2,081
Dirty MindPrince941Funk3,247
Nasty GalBetty Davis369Funk3,247
Cold SweatJames Brown209Funk3,247
A Whole New ThingSly and the Family Stone816Funk3,247
In the Right PlaceDr. John912Funk3,247
Head HuntersHerbie Hancock409Jazz2,075
EnRoute: John Scofield Trio LIVEJohn Scofield Trio612Jazz2,075
In a Silent WayMiles Davis65Jazz2,075
ConciertoJim Hall612Jazz2,075
My Favorite ThingsJohn Coltrane377Jazz2,075

For instance, the cumulative sum for the folk albums is 809 + 246 + 400 + 214 + 412 = 2,081.

Syntax Example #4: OVER (ORDER BY PARTITION BY)

You can also use both PARTITION BY and ORDER BY in OVER(). The rows are divided into segments with PARTITION BY and processed in a given order by ORDER BY.

Using the query below, I can show all the album analytical data and calculate the cumulative sum for each album separately:

SELECT sales_period,
	 album_title,
	 artist,
	 copies_sold,
	 SUM (copies_sold) OVER(PARTITION BY album_title ORDER BY sales_period ASC) AS cumulative_sum_by_album
FROM album_catalogue;

I calculate this with the help of the SUM() window function, as I did earlier. I partitioned the window by album. This means that the sum will be cumulated until the function reaches the last row of a particular album. When it reaches another album, it resets and starts accumulating the sum from the beginning.

I also use ORDER BY to instruct the function to cumulate the sum from the earliest to the latest quarter.

sales_periodalbum_titleartistcopies_soldcumulative_sum_by_album
2022_1QA Whole New ThingSly and the Family Stone674674
2022_2QA Whole New ThingSly and the Family Stone257931
2022_3QA Whole New ThingSly and the Family Stone6661,597
2022_4QA Whole New ThingSly and the Family Stone8162,413
2023_1QA Whole New ThingSly and the Family Stone8163,229
2023_2QA Whole New ThingSly and the Family Stone3023,531
2023_3QA Whole New ThingSly and the Family Stone1233,654
2022_1QBlueJoni Mitchell589589
2022_2QBlueJoni Mitchell184773
2022_3QBlueJoni Mitchell2561,029
2022_4QBlueJoni Mitchell4121,441
2023_1QBlueJoni Mitchell4121,853
2023_2QBlueJoni Mitchell991,952
2023_3QBlueJoni Mitchell9952,947
2022_1QWednesday Morning, 3 A.MSimon & Garfunkel1,0431,043
2022_2QWednesday Morning, 3 A.MSimon & Garfunkel4371,480
2022_3QWednesday Morning, 3 A.MSimon & Garfunkel1841,664
2022_4QWednesday Morning, 3 A.MSimon & Garfunkel8092,473
2023_1QWednesday Morning, 3 A.MSimon & Garfunkel8093,282
2023_2QWednesday Morning, 3 A.MSimon & Garfunkel3253,607
2023_3QWednesday Morning, 3 A.MSimon & Garfunkel6124,219

You can see that the cumulative sum for ‘A Whole New Thing’  is 3,654. The next album (‘Blue’)  starts with the next row, so the cumulation is reset: the cumulative sum is the same as the individual sales of the album in the first quarter of 2022. Then, it accumulates until it reaches the next album. The result goes all the way until the last album, which is ‘Wednesday Morning, 3 A.M.’ in our case.

I’ve shown you the most common ways of defining window frames with these examples. But these are not the only ways. You can also use the ROW or RANGE clauses with the syntax and bounds we explained earlier. Don’t worry. You’ll see the practical use of this in real-world examples.

I used only one function, SUM(), throughout these examples. This is one of many window functions; let’s quickly review some of the others.

What Are the Most Common Window Functions?

The most common window functions can be divided into three categories:

  1. Aggregate Window Functions:
  • COUNT() – Counts the number of rows within a window.
  • SUM() – Totals given values within a window.
  • AVG() – Calculate the average of given values within a window.
  • MIN() – Finds the smallest value within a window.
  • MAX() – Finds the largest value within a window.
  1. Ranking Window Functions:
  • ROW_NUMBER() – Ranks values sequentially, with different ranks for the tied values.
  • RANK() – Ranks values using the same rank for tied values; skips the next rank after the ties (e.g. 1, 2, 2, 4).
  • DENSE_RANK() – Ranks values using the same rank for tied values; doesn’t skip the next rank after the ties (e.g. 1,2,2,3,4).
  1. Analytic Window Functions:
  • LEAD() – Gets data from a defined offset (i.e. a stated number of rows) after the current row.
  • LAG() – Gets data from a defined offset (i.e. a stated number of rows) before the current row.

There are more window functions you could find useful. Check them out in our free SQL Window Functions Cheat Sheet.

Real-World SQL Window Function Examples

So far, I’ve focused more on the syntax of SQL window functions. I’ll now show you the most common practical uses of window functions and how they can help data analysts in their work.

These examples will use the same dataset as earlier.

Example #1: Percentage of Total

Let’s show info about each album and its sales in the first quarter of 2023. Additionally, we will show quarterly sales by each genre. Then, let’s calculate how much each album (as a percentage) contributes to genre sales.

SELECT album_title,
	 artist,
	 copies_sold,
	 album_genre,
	 SUM(copies_sold) OVER (PARTITION BY album_genre) AS sales_by_genre,
	 (copies_sold*1.0/ SUM(copies_sold) OVER (PARTITION BY album_genre))*100.0 AS percent_of_genre_sales
FROM album_catalogue
WHERE sales_period = '2023_1Q'
ORDER BY album_genre, copies_sold DESC;

To get the sales by genre, I again use the SUM() window function. In the OVER() clause, I use only PARTITION BY. That way, I can partition the window by the album genre.

In the next code line, I divide the copies sold (of each album) and divide by the genre sales. To do that, simply copy the calculation from the previous line. Then, multiply the quotient by 100 to get the percentage. You’ll notice that I also multiplied copies_sold with 1.0. This is for converting integers to decimal values.

Filter the desired quarter using WHERE. Finally, order the output alphabetically by genre and then descendingly by copies sold.

Here’s the result:

album_titleartistcopies_soldalbum_genresales_by_genrepercent_of_genre_sales
Wednesday Morning, 3 A.MSimon & Garfunkel809Folk2,08138.88
BlueJoni Mitchell412Folk2,08119.80
Five Leaves LeftNick Drake400Folk2,08119.22
The Freewheelin' Bob DylanBob Dylan246Folk2,08111.82
The New Folk Sound of Terry CallierTerry Callier214Folk2,08110.28
Dirty MindPrince941Funk3,24728.98
In the Right PlaceDr. John912Funk3,24728.09
A Whole New ThingSly and the Family Stone816Funk3,24725.13
Nasty GalBetty Davis369Funk3,24711.36
Cold SweatJames Brown209Funk3,2476.44
EnRoute: John Scofield Trio LIVEJohn Scofield Trio612Jazz2,07529.49
ConciertoJim Hall612Jazz2,07529.49
Head HuntersHerbie Hancock409Jazz2,07519.71
My Favorite ThingsJohn Coltrane377Jazz2,07518.17
In a Silent WayMiles Davis65Jazz2,0753.13

Let’s check the calculation for the first row. The album by Simon & Garfunkel sold 809 copies. The total folk album sales for that quarter were 2,081. So the percentage of the individual sales in total genre sales is 809/2,081*100 = 38.88%.

The sum of percentages for each genre should be 100%. Let’s check this on a folk genre: 38.88% + 19.80% + 19.22% + 11.82% + 10.28 % = 100%.

Window Function vs. Aggregate Function vs. GROUP BY

I’m again using the aggregate function as a window function. I could’ve used a simple aggregate SUM() function with GROUP BY to get sales by each genre for the specified quarter. What’s the difference, then?

A window function allows you to show both analytical and aggregate data (individual sales with sales by genre and the quotient of these values), while an aggregate function used with GROUP BY would collapse the individual row and show only the aggregate value (the sum of sales for the quarter).

Solve This Exercise for Practice

Using window functions, rewrite the above query so that it shows the average sale by genre. Also, show how much each album's sales are above or below the genre average (as a percentage). Show only sales from the third quarter of 2023. Show the album title, artist, copies sold, and the album genre. Sort the output ascendingly by genre and individual album sales.

Solution:

SELECT album_title,
	 artist,
	 copies_sold,
	 album_genre,
	 AVG(copies_sold) OVER (PARTITION BY album_genre) AS average_sales_by_genre,
	   ((copies_sold/AVG(copies_sold) OVER (PARTITION BY album_genre))-1)*100 AS pct_from_average	   
FROM album_catalogue
WHERE sales_period = '2023_3Q'
ORDER BY album_genre, copies_sold;

Output:

album_titleartistcopies_soldalbum_genreaverage_sales_by_genrepct_from_average
The New Folk Sound of Terry CallierTerry Callier283Folk561.6-49.61
Five Leaves LeftNick Drake321Folk561.6-42.84
The Freewheelin' Bob DylanBob Dylan597Folk561.66.30
Wednesday Morning, 3 A.MSimon & Garfunkel612Folk561.68.97
BlueJoni Mitchell995Folk561.677.17
A Whole New ThingSly and the Family Stone123Funk533.4-76.94
Dirty MindPrince169Funk533.4-68.32
In the Right PlaceDr. John562Funk533.45.36
Nasty GalBetty Davis808Funk533.451.48
Cold SweatJames Brown1005Funk533.488.41
ConciertoJim Hall263Jazz464-43.32
My Favorite ThingsJohn Coltrane302Jazz464-34.91
EnRoute: John Scofield Trio LIVEJohn Scofield Trio404Jazz464-12.93
Head HuntersHerbie Hancock542Jazz46416.81
In a Silent WayMiles Davis809Jazz46474.35

Example #2: Rank Data

In this example, I’ll use a window function to rank data. I want to show each distinct album title and its length and rank them by length. The longest album will be ranked first.

SELECT *, 
	 RANK() OVER (ORDER BY album_length DESC) AS album_length_rank
FROM (SELECT DISTINCT album_title,
	       album_length
	FROM album_catalogue) AS distinct_album;

Let’s start by explaining the subquery: we use it to select distinct albums and their lengths.

Then, we use the main query to select all the data from the subquery. Now, use the RANK() window function to rank albums. You can also use other ranking functions, depending on your data and tasks.

For the ranking to work the way you want to, use the ORDER BY clause in OVER(). Specify the column by which you want to rank and in what order. In this case, it’s descendingly by length.

Here’s the ranking:

album_titlealbum_lengthalbum_length_rank
EnRoute: John Scofield Trio LIVE1:13:481
The Freewheelin' Bob Dylan0:44:142
Head Hunters0:41:523
Five Leaves Left0:41:434
My Favorite Things0:40:255
Nasty Gal0:39:156
In a Silent Way0:38:087
Concierto0:38:028
A Whole New Thing0:38:019
The New Folk Sound of Terry Callier0:37:4110
Blue0:36:1511
Cold Sweat0:33:4312
In the Right Place0:33:2213
Wednesday Morning, 3 A.M0:31:3814
Dirty Mind0:30:1415

Solve This Exercise for Practice

Rank each unique album by its sales within its genre. Show only data for the first quarter of 2023. Show the album title, its sales, genre, and rank. If there are albums with the same number of sales, rank them equally and don't skip the next rank.

Solution:

SELECT *, 
	 DENSE_RANK() OVER (PARTITION BY album_genre ORDER BY copies_sold DESC) AS album_sales_rank
FROM (SELECT DISTINCT album_title,
	       copies_sold,
	       album_genre
	FROM album_catalogue
	WHERE sales_period = '2023_1Q') AS distinct_album;

Output:

album_titlecopies_soldalbum_genrealbum_sales_rank
Wednesday Morning, 3 A.M809Folk1
Blue412Folk2
Five Leaves Left400Folk3
The Freewheelin' Bob Dylan246Folk4
The New Folk Sound of Terry Callier214Folk5
Dirty Mind941Funk1
In the Right Place912Funk2
A Whole New Thing816Funk3
Nasty Gal369Funk4
Cold Sweat209Funk5
EnRoute: John Scofield Trio LIVE612Jazz1
Concierto612Jazz1
Head Hunters409Jazz2
My Favorite Things377Jazz3
In a Silent Way65Jazz4

Example #3: Running Total

In this example, I’ll show a particular album’s sales period, title, artist, and the copies sold. I’ll also add a running total of copies sold that will include three rows: the current row and the two previous. The sum should be calculated from the earliest to the latest quarter. 

SELECT sales_period, 
	 album_title,
	 artist,
	 copies_sold,
	 SUM(copies_sold) OVER (ORDER BY sales_period ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS sales_running_total
FROM album_catalogue
WHERE album_title = 'In a Silent Way';

I again use the SUM() window function. Then there’s an ORDER BY in OVER() to order the sales so we can sum them ascendingly.

Next, I need to define the moving window frame. The running total should include the current row and the previous two rows. These are the lower and upper bounds specified in the ROWS clause. The lower bound is two previous rows, i.e., BETWEEN 2 PRECEDING. The upper bound is CURRENT ROW. The two bounds are stitched together in a window frame using the keyword AND.

I want to show the calculation for Miles Davis’s album ‘In a Silent Way’, so I filter data using WHERE.

Here are the running totals:

sales_periodalbum_titleartistcopies_soldsales_running_total
2022_1QIn a Silent WayMiles Davis428428
2022_2QIn a Silent WayMiles Davis1,0531,481
2022_3QIn a Silent WayMiles Davis191,500
2022_4QIn a Silent WayMiles Davis651,137
2023_1QIn a Silent WayMiles Davis65149
2023_2QIn a Silent WayMiles Davis218348
2023_3QIn a Silent WayMiles Davis8091,092

Let’s check the result and explain what a running total is.

A running total is similar to a cumulative total (or sum), but they’re not the same. The cumulative total will give you the sum of the current row and all the previous rows, i.e., the window frame increases with each row. A running total is a sum within a defined window frame that stays the same size but moves with each row. In our case, the window is defined as the current row and the two previous rows.

Take a look at the highlighted values. The running total for the first quarter in 2022 is 428, the same as the individual sale. There are no previous rows, so the running total includes only the current row.

The next running total is 428 + 1,053 = 1,481. It sums the current and the previous row, as there is only one previous row.

The running total for the third quarter in 2022 is  428 + 1,053 + 19 = 1,500. This is the first time you get the whole window, i.e., the current row and the two previous rows.

As you go to the next row, the window will move but its size will remain the same. The running total for the following quarter is 428 + 1,053 + 19 + 65 = 1,137. It, again, involves the current row and the two previous rows – but different ones compared to the quarter.

Solve this exercise for practice

Rewrite the above query so it calculates the running total for the album ‘The New Folk Sound of Terry Callier’. The running total should be calculated from the earliest to the latest quarter. It should include four quarters: the two previous, the current quarter, and the one following. Also, show the sales period, album title, artist, and the number of copies sold.

Solution:

SELECT sales_period, 
	 album_title,
	 artist,
	 copies_sold,
	 SUM(copies_sold) OVER (ORDER BY sales_period ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) AS sales_running_total
FROM album_catalogue
WHERE album_title = 'The New Folk Sound of Terry Callier';

Output:

sales_periodalbum_titleartistcopies_soldsales_running_total
2022_1QThe New Folk Sound of Terry CallierTerry Callier9032,575
2022_2QThe New Folk Sound of Terry CallierTerry Callier4182,789
2022_3QThe New Folk Sound of Terry CallierTerry Callier1,2543,003
2022_4QThe New Folk Sound of Terry CallierTerry Callier2142,641
2023_1QThe New Folk Sound of Terry CallierTerry Callier2142,506
2023_2QThe New Folk Sound of Terry CallierTerry Callier5411,252
2023_3QThe New Folk Sound of Terry CallierTerry Callier2831,038

Example #4: Quarter-To-Quarter Difference

In this last example, I’ll show how to use the window functions to calculate the sales difference between quarters:

SELECT *,
	 LAG(quarterly_copies_sold) OVER (ORDER BY sales_period) AS previous_quarter_sales,
	 quarterly_copies_sold - LAG(quarterly_copies_sold) OVER (ORDER BY sales_period) AS quarterly_sales_difference
FROM (SELECT sales_period,
	SUM(copies_sold) AS quarterly_copies_sold
FROM album_catalogue
GROUP BY sales_period) AS quarterly_sales; 

I first write a subquery that calculates the total sales for each quarter. I use the SUM() aggregate function and group the results by the sales period.

Next, I select all the data from the subquery in the main query.

Now I need to get the previous quarter's sales. I’ll write the LAG() window function, which is used to access values from the previous rows. The value I want to access is specified in the function. In this case, it’s the quarterly copies sold from the subquery. By defining the offset argument, the function allows me to define how far back I want to go. I didn’t define it, so the default offset is one. In other words, the function will get the data from the previous row/quarter. But if you want to go two rows/quarters back, then you would write LAG(quarterly_copies_sold, 2).

I also use ORDER BY in OVER() to make sure the values within the frame are sorted from the oldest to the latest quarter.

This use of the window function is so that it’s clearer what I’ll do in the next code line. This is where the actual calculation of comparing the current and previous quarter’s sales is done. It’s simple now:  subtract the window function defined above from the column quarterly_copies_sold.

Here’s the output:

sales_periodquarterly_copies_soldprevious_quarter_salesquarterly_sales_difference
2022_1Q9,519NULLNULL
2022_2Q7,5819,519-1,938
2022_3Q4,2737,581-3,308
2022_4Q7,4034,2733,130
2023_1Q7,4037,4030
2023_2Q4,9567,403-2,447
2023_3Q7,7954,9562,839

There are no previous values for 2022_1Q, as there is no previous quarter. The quarterly sales for 2022_2Q are 7,581. The sales in the previous quarter were 9,519. The calculation shows that the current sales are 1,938 copies (7,581 - 9,519) below the previous quarter's sales.

You can analyze the rest of the output the same way.

Solve This Exercise for Practice

Rewrite the above query so it shows the difference between quarterly sales on a year-over-year basis – e.g., compare the first quarter of 2023 with the first quarter of 2022. Show the sales period, copies sold in the quarter, sales for the same quarter in the previous year, and the year-over-year difference between the quarters.

Solution:

SELECT *,
	 LAG(quarterly_copies_sold, 4) OVER (ORDER BY sales_period) AS year_over_year_sales,
	 quarterly_copies_sold - LAG(quarterly_copies_sold, 4) OVER (ORDER BY sales_period) AS year_over_year_difference
FROM (SELECT sales_period,
	   SUM(copies_sold) AS quarterly_copies_sold
FROM album_catalogue
GROUP BY sales_period) AS quarterly_sales; 

Output:

sales_periodquarterly_copies_soldyear_over_year_salesyear_over_year_difference
2022_1Q9,519NULLNULL
2022_2Q7,581NULLNULL
2022_3Q4,273NULLNULL
2022_4Q7,403NULLNULL
2023_1Q7,4039,519-2,116
2023_2Q4,9567,581-2,625
2023_3Q7,7954,2733,522

If you want more, here are additional window function examples. For practice materials, take a look at these 11 SQL window functions exercises.

SQL Window Functions: A Window to Better Data Analysis

This turned out to be a pretty comprehensive article about SQL window functions. You learned the window functions and how each crucial part of their syntax works.

You also know there are several categories of window functions. The most commonly used are aggregate, ranking, and analytical window functions. The practical examples showed you how window functions can be used in common data analysis tasks.

Hopefully, you didn’t skip the exercises in the article. If you did, I once again recommend that you solve them. Only through practice can you really bring home what SQL window functions are about.

The richest resource for learning and practicing is our Window Functions course. It is an interactive course that has over 200 hands-on exercises and covers the full syntax of window functions. If you have job interviews lined up, make sure you go through these SQL window functions interview questions.  Good luck, and keep on learning SQL!