How to Use the SQL RANK OVER (PARTITION BY)
Ranking data in SQL is a breeze if you know how to use RANK() to rank over a partition. This article shows you how to do it, as well as how RANK() differs from DENSE_RANK() and ROW_NUMBER().
If you work with SQL in a professional setting, you have had to rank data at least once. Think of ranking months/years/quarters by revenue or costs, best-selling products, most viewed posts or streamed songs, employees by salary, most profitable branches, you name it.
Or ranking books by their sales.
The thing is, you often have to rank data within a certain category, or partition as we call it. The SQL RANK OVER (PARTITION BY)
enters the game!
This is one typical example of window functions in SQL. For a more detailed explanation of window functions, our window functions course is the best. Through 218 interactive exercises, you learn more about PARTITION BY and other window function clauses, such as ORDER BY
, ROWS
, and RANGE
. These functions are used not only in ranking but also in aggregate and analytics window functions.
Let’s get back to book sales. Take a look at this table, with the code for creating it here. How do we rank book sales within each language?
id | title | author | original_language | sales | classify_under |
---|---|---|---|---|---|
1 | The Hobbit | J. R. R. Tolkien | English | 100 | Fantasy |
2 | Watership Down | Richard Adams | English | 50 | Fantasy |
3 | Harry Potter and the Philosopher's Stone | J. K. Rowling | English | 120 | Fantasy |
4 | The Plague | Albert Camus | French | 12 | Classics |
5 | The Divine Comedy | Dante Alighieri | Italian | 12 | Poetry |
6 | War and Peace | Leo Tolstoy | Russian | 36 | Classics |
7 | Nineteen Eighty-Four | George Orwell | English | 30 | Classics |
8 | Andromeda Nebula | Ivan Yefremov | Russian | 20 | Science fiction |
9 | The Little Prince | Antoine de Saint-Exupéry | French | 200 | Kids |
10 | The Stranger | Albert Camus | French | 10 | Classics |
11 | The Adventures of Pinocchio | Carlo Collodi | Italian | 35 | Kids |
12 | The Name of the Rose | Umberto Eco | Italian | 50 | Classics |
13 | One Hundred Years of Solitude | Gabriel García Márquez | Spanish | 50 | Classics |
14 | Don Quixote | Miguel de Cervantes | Spanish | 500 | Classics |
15 | Lolita | Vladimir Nabokov | English | 50 | Classics |
The solution is simple when you know that the language, in this example, acts as something called data partition.
SELECT original_language, title, author, sales, RANK() OVER (PARTITION BY original_language ORDER BY sales DESC ) AS sales_rank FROM books; |
And here it is!
original_language | title | author | sales | sales_rank |
---|---|---|---|---|
English | Harry Potter and the Philosopher's Stone | J. K. Rowling | 120 | 1 |
English | The Hobbit | J. R. R. Tolkien | 100 | 2 |
English | Lolita | Vladimir Nabokov | 50 | 3 |
English | Watership Down | Richard Adams | 50 | 3 |
English | Nineteen Eighty-Four | George Orwell | 30 | 5 |
French | The Little Prince | Antoine de Saint-Exupéry | 200 | 1 |
French | The Plague | Albert Camus | 12 | 2 |
French | The Stranger | Albert Camus | 10 | 3 |
Italian | The Name of the Rose | Umberto Eco | 50 | 1 |
Italian | The Adventures of Pinocchio | Carlo Collodi | 35 | 2 |
Italian | The Divine Comedy | Dante Alighieri | 12 | 3 |
Russian | War and Peace | Leo Tolstoy | 36 | 1 |
Russian | Andromeda Nebula | Ivan Yefremov | 20 | 2 |
Spanish | Don Quixote | Miguel de Cervantes | 500 | 1 |
Spanish | One Hundred Years of Solitude | Gabriel García Márquez | 50 | 2 |
Now, let’s dissect this example.
What Is RANK()?
In simplest terms, RANK()
is a window function.
The SQL window functions are similar to aggregate functions in that they are applied to a group of rows. One big difference: window functions keep details of the individual rows, unlike aggregate functions with GROUP BY
.
What Does RANK() Do?
As the name suggests, it ranks data. This makes it a ranking window function, along with DENSE_RANK()
and ROW_NUMBER()
.
When using any of these window functions, it must be accompanied by an OVER (ORDER BY)
clause. The OVER()
clause is mandatory for any window function. It’s what turns a “regular” function into a window function.
For these ranking functions, the ORDER BY
in parentheses defines the order in which ranking is done. This order may be ascending or descending. Keep in mind that it does not affect the order of the rows in the result; that is done with an ORDER BY
at the end of the query.
Learn more about this in our article about the RANK() window function.
How RANK() Works With OVER (ORDER BY)
Let’s use the dataset above. Take the whole table and rank the books by sales. Let’s see what happens.
SELECT title, author, sales, RANK() OVER ( ORDER BY sales DESC ) AS sales_rank FROM books; |
This is the same as the previous query, except this one does not select the language and does not use PARTITION BY
. So, the RANK()
function is followed by OVER()
. The ORDER BY
clause in it tells the function to rank the data by sales in descending order, i.e., from the highest- to the lowest-selling books. Since the PARTITION BY
clause is omitted, the function ranks the whole table.
Here are the first ten rows of the output.
title | author | sales | sales_rank |
---|---|---|---|
Don Quixote | Miguel de Cervantes | 500 | 1 |
The Little Prince | Antoine de Saint-Exupéry | 200 | 2 |
Harry Potter and the Philosopher's Stone | J. K. Rowling | 120 | 3 |
The Hobbit | J. R. R. Tolkien | 100 | 4 |
Lolita | Vladimir Nabokov | 50 | 5 |
Watership Down | Richard Adams | 50 | 5 |
The Name of the Rose | Umberto Eco | 50 | 5 |
One Hundred Years of Solitude | Gabriel García Márquez | 50 | 5 |
War and Peace | Leo Tolstoy | 36 | 9 |
The Adventures of Pinocchio | Carlo Collodi | 35 | 10 |
The critical thing to notice here: there are four books with 50 million copies sold, and all are ranked fifth.
This is how RANK()
differs from the other two ranking window functions – it gives the same rank to tied values. When the function reaches the next sales value (in this case, 36 million copies sold), it does not assign the next consecutive rank value (6) but rather skips to adjust for the count of tied sales values. As mentioned, the rank “5” appears four times; therefore, the next rank assigned is nine.
DENSE_RANK()
also ranks the ties with the same ranking. However, unlike RANK()
, it does not skip rank values based on ties. The same data ranked with DENSE_RANK()
is as follows.
title | author | sales | sales_rank |
---|---|---|---|
Don Quixote | Miguel de Cervantes | 500 | 1 |
The Little Prince | Antoine de Saint-Exupéry | 200 | 2 |
Harry Potter and the Philosopher's Stone | J. K. Rowling | 120 | 3 |
The Hobbit | J. R. R. Tolkien | 100 | 4 |
Lolita | Vladimir Nabokov | 50 | 5 |
Watership Down | Richard Adams | 50 | 5 |
The Name of the Rose | Umberto Eco | 50 | 5 |
One Hundred Years of Solitude | Gabriel García Márquez | 50 | 5 |
War and Peace | Leo Tolstoy | 36 | 6 |
The Adventures of Pinocchio | Carlo Collodi | 35 | 7 |
After several books ranked fifth, the next rank is sixth, not ninth as with RANK()
.
What about ROW_NUMBER()
? It does not care about ties or skipping. It just ranks rows sequentially. The first ten rows using ROW_NUMBER()
are below.
title | author | sales | sales_rank |
---|---|---|---|
Don Quixote | Miguel de Cervantes | 500 | 1 |
The Little Prince | Antoine de Saint-Exupéry | 200 | 2 |
Harry Potter and the Philosopher's Stone | J. K. Rowling | 120 | 3 |
The Hobbit | J. R. R. Tolkien | 100 | 4 |
Lolita | Vladimir Nabokov | 50 | 5 |
Watership Down | Richard Adams | 50 | 6 |
The Name of the Rose | Umberto Eco | 50 | 7 |
One Hundred Years of Solitude | Gabriel García Márquez | 50 | 8 |
War and Peace | Leo Tolstoy | 36 | 9 |
The Adventures of Pinocchio | Carlo Collodi | 35 | 10 |
The differences between these functions are explained in the ranking functions overview.
How RANK() Works With OVER (PARTITION BY)
The PARTITION BY
clause divides data into partitions or subsets. When used with RANK()
, this means the data is ranked within the partition. When it reaches the second partition, the ranking is reset to start from one.
Let’s look at the query from the beginning of this article again to clarify this.
SELECT original_language, title, author, sales, RANK() OVER (PARTITION BY original_language ORDER BY sales DESC ) AS sales_rank FROM books; |
In this query, RANK()
is used with PARTITION BY
. The partition, in this case, is original_language
. The ranking is done by sales in descending order as specified in the ORDER BY
clause.
The way in which we have written PARTITION BY
and ORDER BY
means that books are ranked by sales but within each language category. Once the function ranks all the books in one language, it restarts as it reaches the second language, and so on.
We see this in the query output.
original_language | title | author | sales | sales_rank |
---|---|---|---|---|
English | Harry Potter and the Philosopher's Stone | J. K. Rowling | 120 | 1 |
English | The Hobbit | J. R. R. Tolkien | 100 | 2 |
English | Lolita | Vladimir Nabokov | 50 | 3 |
English | Watership Down | Richard Adams | 50 | 3 |
English | Nineteen Eighty-Four | George Orwell | 30 | 5 |
French | The Little Prince | Antoine de Saint-Exupéry | 200 | 1 |
French | The Plague | Albert Camus | 12 | 2 |
French | The Stranger | Albert Camus | 10 | 3 |
Italian | The Name of the Rose | Umberto Eco | 50 | 1 |
Italian | The Adventures of Pinocchio | Carlo Collodi | 35 | 2 |
Italian | The Divine Comedy | Dante Alighieri | 12 | 3 |
Russian | War and Peace | Leo Tolstoy | 36 | 1 |
Russian | Andromeda Nebula | Ivan Yefremov | 20 | 2 |
Spanish | Don Quixote | Miguel de Cervantes | 500 | 1 |
Spanish | One Hundred Years of Solitude | Gabriel García Márquez | 50 | 2 |
We have marked each partition with a different color to see the different partitions easily. The highest-selling book in English is J.K. Rowling’s Harry Potter and the Philosopher's Stone. Then it is The Hobbit. Lolita and Watership Down are both ranked third, as RANK()
assigns the same rank to books with the same sales value. Then one rank value is skipped, and Nineteen Eighty-Four is ranked fifth.
The next partition is the French language, and the ranking is restarted. The Little Prince is ranked the best-selling book in French.
We see the same logic works for books in Italian, Russian, and Spanish.
Bonus Example
We have studied these queries to show how RANK() OVER (PARTITION BY)
works. Now, let’s practice!
The following isn’t much different from the first query. You should not have problems applying what you have learned.
There’s a column in the books
table named classify_under
. It specifies the category under which each book is to be placed in the bookstore.
Let’s rank the books by sales for each category.
SELECT classify_under, title, author, sales, RANK() OVER (PARTITION BY classify_under ORDER BY sales DESC ) AS sales_rank FROM books; |
The query selects the column classify_under
instead of original_language
in the first query. All other selected columns are the same.
There’s also a difference in RANK()
. Since we’re ranking by classify_under
, this is the column that must be in the PARTITION BY
clause.
Once again, we want to rank books by sales in descending order. Here’s the ranking:
classify_under | title | author | sales | sales_rank |
---|---|---|---|---|
Classics | Don Quixote | Miguel de Cervantes | 500 | 1 |
Classics | Lolita | Vladimir Nabokov | 50 | 2 |
Classics | The Name of the Rose | Umberto Eco | 50 | 2 |
Classics | One Hundred Years of Solitude | Gabriel García Márquez | 50 | 2 |
Classics | War and Peace | Leo Tolstoy | 36 | 5 |
Classics | Nineteen Eighty-Four | George Orwell | 30 | 6 |
Classics | The Plague | Albert Camus | 12 | 7 |
Classics | The Stranger | Albert Camus | 10 | 8 |
Fantasy | Harry Potter and the Philosopher's Stone | J. K. Rowling | 120 | 1 |
Fantasy | The Hobbit | J. R. R. Tolkien | 100 | 2 |
Fantasy | Watership Down | Richard Adams | 50 | 3 |
Kids | The Little Prince | Antoine de Saint-Exupéry | 200 | 1 |
Kids | The Adventures of Pinocchio | Carlo Collodi | 35 | 2 |
Poetry | The Divine Comedy | Dante Alighieri | 12 | 1 |
Science fiction | Andromeda Nebula | Ivan Yefremov | 20 | 1 |
By pure coincidence, there are five partitions again. In the “Classics” category, Don Quixote is the best-selling book. Then, there are three books ranked second. The ranking sequence is skipped until we get to War and Peace in fifth place. The rest of the Classics are ranked sequentially since there are no more ties.
In other categories, there are no ties. “Poetry” and “Science fiction” have only one book in each category. So, there’s just the first rank.
For more examples with other window functions, go to our article explaining how to use PARTITION BY.
Add RANK() to Your SQL Vocabulary
We have seen the most typical uses of the RANK()
window function. While it requires an OVER (ORDER BY)
, the PARTITION BY
clause unlocks its possibilities. It makes RANK()
a sophisticated tool for ranking data across one or more partitions with ease in your day-to-day work.
To learn more about and practice RANK()
and other (ranking) window functions, use our Window Functions course. You get an even more detailed explanation of ranking and a chance to write plenty of code in our exercises.
Happy ranking!