25th Mar 2020 16 minutes read How to Rank Rows in SQL: A Complete Guide Kamil Bladoszewski window functions Table of Contents SQL Ranking Functions Are Window Functions Basic Ranking Functions RANK() vs. DENSE_RANK() vs. ROW_NUMBER() SQL Ranking Use Cases Rank Over Single Column Rank Over Multiple Columns RANK() With Top 10 Results Ranking by Date Ranking by Month Ranking With GROUP BY Ranking With COUNT() Ranking With SUM() RANK() OVER(PARTITION BY ...)—Single Column RANK() OVER(PARTITION BY ...)—Multiple Columns Remember, Practice Makes Perfect Don’t struggle with SQL ranking functions anymore! This article will guide you through the most common ranking use cases. Rankings are used around us every day. Sports tournaments, the best movies and TV series on Netflix, stores with the cheapest products—these are only a few examples of rankings you might’ve seen recently. Using SQL RANK functions might be hard sometimes. The variety of different constructions is enormous. You can easily get lost in everything there is to learn about ranking. The key is to understand the important concepts and know where to look for additional information. This article will focus on SQL ranking functions and how to use them in different situations. To learn more about ranking functions, check out our interactive Window Functions course. It contains over 200 hands-on exercises that cover ranking functions and other window functions. In this guide, you’ll find the knowledge that will allow you to write many kinds of SQL ranking queries. First, I’ll explain how ranking works. Then, I’ll show a bunch of examples. If you’re looking for specific use cases, you can simply dive into the SQL Ranking Use Cases section. SQL Ranking Functions Are Window Functions Let’s start with the basics. What exactly are the ranking functions in SQL? They’re part of a family of functions called window functions. Window functions use a scope (window), which looks at part of the data to compute the result. Then, it moves to another part of the data and computes the result for that portion. You can get a basic idea of how it works from SQL Window Function Example With Explanations. Basic Ranking Functions The most common (and simple) types of ranking functions are: RANK() DENSE_RANK() ROW_NUMBER() What do they do and how are they different from each other? I’ll briefly explain each of them. Then, I’ll compare the rankings for each of the functions. The RANK() function creates a ranking of the rows based on a provided column. It starts with assigning “1” to the first row in the order and then gives higher numbers to rows lower in the order. If rows have the same value, they’re ranked the same. However, the next spot is shifted accordingly. For example, if two rows are 5th (have the same rank), the next row will be 7th (i.e., 6th doesn’t exist). The DENSE_RANK() function is rather similar. The only difference is that it doesn’t leave gaps in the ranking. Even though more than one row can have the same rank, the next row will get the next rank. For example, if two rows are 5th, the next row will be 6th. The ROW_NUMBER() function is different. If rows have the same value, they all will get consecutive rankings (not the same ranking as with the previous functions). For example, if two rows have the same value (they would both be 5th with the previous functions), ROW_NUMBER() would place them at 5th and 6th. Now, you might wonder: If a few rows have the same value and we’re using ROW_NUMBER(), how can we tell which row will be first, second, etc.? The answer is rather grim: we can’t! It depends on many factors, and you can’t predict which row will get which rank. These are the basic and (probably) most commonly used SQL ranking functions. If you want to learn about other ranking functions, you can read this overview of ranking functions. RANK() vs. DENSE_RANK() vs. ROW_NUMBER() Now that you know the theory of ranking functions, let’s compare the results from these functions. In order to do so, let’s rank some people based on their test scores. Take a look at the data (table exam_result): first_namelast_namepoints EmyrDownes70 DinaMorin70 Evie-MayBoyer80 NoraParkinson60 TrystanOconnor40 ErykMyers90 MarleneDuncan90 MariusPowell90 JoanneGoddard50 RayhanWilliamson90 We will create a query that ranks the rows by the points column using the ranking functions described above: SELECT RANK() OVER(ORDER BY points DESC) AS rank, DENSE_RANK() OVER(ORDER BY points DESC) AS dense_rank, ROW_NUMBER() OVER(ORDER BY points DESC) AS row_number, first_name, last_name, points FROM exam_result; Let’s analyze the code before we run it to see the results. The first column we want to show is rank. It’ll simply be the ranking created using the RANK() function. However, what does OVER(ORDER BY points DESC) mean? It’s part of the window function concept. OVER() is a compulsory part of all ranking functions (actually, all window functions). It tells the function about the data scope (window) and the order in which the rows will be placed. In this case, we only provided the order. You can find an example of defining a window in the RANK() OVER(PARTITION BY ...)—Single Column section. Since you now understand the first column, you should also understand the following two. The only difference is the name of the ranking functions. Also, the last three columns should be self-explanatory, as you’ve already seen the data. Alright, let’s take look at the results: rankdense_rankrow_numberfirst_namelast_namepoints 111MarleneDuncan90 112RayhanWilliamson90 113MariusPowell90 114ErykMyers90 525Evie-MayBoyer80 636EmyrDownes70 637DinaMorin70 848NoraParkinson60 959JoanneGoddard50 10610TrystanOconnor40 Can you spot the differences between each function? For the first few rows, the rank and dense_rank columns look the same. However, row_number looks different from the very beginning. Even though Marlene, Rayhan, Marius, and Eryk have the same amount of points, the ROW_NUMBER() function assigned them distinct values. As mentioned previously, the order in which these people got their rank is nondeterministic. It’s worth noting that you can make the result of ROW_NUMBER() more deterministic. You simply have to add more columns to the order (e.g., ROW_NUMBER() OVER (ORDER BY points DESC, last_name ASC)). This will be further explained in the Rank Over Multiple Columns section. You can also make ROW_NUMBER() even less deterministic! Even though the ORDER BY part is a necessity for RANK() and DENSE_RANK(), ROW_NUMBER() doesn’t require it at all. This is how you can number the rows without any specific order. Another difference can be seen further down in the table. Evie-May is ranked 5th by the RANK() function and 2nd by the DENSE_RANK() function. The former function notes how many rows had the same ranking and the next row is ranked accordingly. The latter function simply looks at the previous value and the next row is given the next ranking. Take a moment to look at the rankings above and make sure you understand the difference between these three SQL ranking functions. It’s important to know which function to use in which cases. If you need some more explanation, check out How to Use Rank Functions in SQL. SQL Ranking Use Cases It’s time to dive into some SQL RANK queries. We’ll start with some simple queries and gradually increase the complexity. The plan goes as follows: Basic ranking: Rank Over Single Column Rank Over Multiple Columns RANK() With Top 10 Results Ranking with dates: Ranking by Date Ranking by Month Ranking with aggregate functions: Ranking With GROUP BY Ranking With COUNT() Ranking With SUM() Using partition by: RANK() OVER(PARTITION BY ...)—Single Column RANK() OVER(PARTITION BY ...)—Multiple Columns Note that even though we’ll mostly use the SQL RANK() function, the examples can be applied to DENSE_RANK() and ROW_NUMBER(). It depends on your needs. Thus, it’s important to understand the differences between these functions. In the examples, we’ll use modified data from the previous example. The exam_date column shows when the person took the exam, and the city column shows in which city the person took the exam. Take a look: first_namelast_nameexam_datecitypoints EmyrDownes2018-12-18San Francisco70 DinaMorin2019-01-17Los Angeles70 Evie-MayBoyer2019-01-23San Francisco80 NoraParkinson2019-02-16San Diego60 TrystanOconnor2019-02-28Los Angeles40 ErykMyers2019-06-07San Francisco90 MarleneDuncan2019-06-13San Diego90 MariusPowell2019-11-13San Diego90 JoanneGoddard2019-12-18San Diego50 MariusWilliamson2020-01-02San Diego90 Rank Over Single Column We’ll start with the simplest use case: creating a ranking based on only one column. We’ll base our ranking on points: SELECT RANK() OVER(ORDER BY points DESC) AS ranking, first_name, last_name, points FROM exam_result; The query is rather simple, like the previous example. It creates a ranking based on the number of points in descending order. We’re using RANK() OVER(ORDER BY ...) to indicate which column should be used for ordering. The results look like this: rankingfirst_namelast_namepoints 1MarleneDuncan90 1MariusWilliamson90 1MariusPowell90 1ErykMyers90 5Evie-MayBoyer80 6EmyrDownes70 6DinaMorin70 8NoraParkinson60 9JoanneGoddard50 10TrystanOconnor40 If you want to create a ranking based on a different column or in a different order (i.e., ascending instead of descending), simply change the column name or change the DESC keyword to the ASC keyword. Rank Over Multiple Columns Using multiple columns for ordering is also simple. You just add the next column name after the comma. If the values in the first column are equal, then the second column is taken into consideration, and so on. Take a look: SELECT RANK() OVER(ORDER BY points DESC, first_name ASC) AS ranking, first_name, last_name, points FROM exam_result; This query is similar to the previous one. We’ve added first_name ASC to the ordering clause. This way, if a few rows have the same value in the first column, the second one is taken into consideration. Note, that if the second column is the same, then ranking is resolved based on the function we’re using (in our case RANK()). Take a look at the result: rankingfirst_namelast_namepoints 1ErykMyers90 2MariusWilliamson90 2MariusPowell90 4MarleneDuncan90 5Evie-MayBoyer80 6DinaMorin70 7EmyrDownes70 8NoraParkinson60 9JoanneGoddard50 10TrystanOconnor40 As you can see, people with the same number of points get different rankings because they have different names. The only exceptions are the two people named Marius. Because they have the same name and the same number of points, they have the same rank. It’s worth mentioning that ordering by a few columns might be useful if you want to make the result of the ROW_NUMBER() function deterministic. Take a look at this query: SELECT ROW_NUMBER() OVER(ORDER BY points DESC, last_name ASC) AS ranking, first_name, last_name, points FROM exam_result; Since everyone in our data has a different last name, we can add it to the ordering clause so that we can predict who gets which rank. Take a look: rankingfirst_namelast_namepoints 1MarleneDuncan90 2ErykMyers90 3MariusPowell90 4MariusWilliamson90 5Evie-MayBoyer80 6EmyrDownes70 7DinaMorin70 8NoraParkinson60 9JoanneGoddard50 10TrystanOconnor40 In this case, the ROW_NUMBER() function works like RANK() and DENSE_RANK() because each row can be placed deterministically. However, note that if there were people with the same last name, the ROW_NUMBER() function would be nondeterministic even though we’re ordering by two columns. RANK() With Top 10 Results We’re often creating rankings to show the top results (e.g., top 10, top 100, etc.). As you might expect, you can show the top results using SQL ranking functions. Such a query is a little bit more complicated though. Take a look: SELECT * FROM ( SELECT RANK() OVER(ORDER BY points DESC) AS ranking, first_name, last_name, points FROM exam_result ) AS a WHERE ranking <= 3; In this query, we use a subquery to compute the ranking, and then, in the main query, we filter the results so that only the top three are shown. The results might surprise you: rankingfirst_namelast_namepoints 1MarleneDuncan90 1MariusWilliamson90 1MariusPowell90 1ErykMyers90 As you can see, the results show four rows, and each of them has the same value in the ranking column. It’s because of our data. Four rows have the same number of points and thus get the same rank in the subquery. To write such a query on your own, put your ranking query inside a subquery, and write filtering around it. For example, to show the 10, change WHERE ranking <= 3 to WHERE ranking <= 10. It’s also worth noting what happens when you use DENSE_RANK() with this data. Take a look at the below query and its results: SELECT * FROM ( SELECT DENSE_RANK() OVER(ORDER BY points DESC) AS ranking, first_name, last_name, points FROM exam_result ) AS a WHERE ranking <= 3; rankingfirst_namelast_namepoints 1MarleneDuncan90 1MariusWilliamson90 1MariusPowell90 1ErykMyers90 2Evie-MayBoyer80 3EmyrDownes70 3DinaMorin70 As you can see, there are a lot of rows in the result! Once again, it’s because of how DENSE_RANK() ranks each row. This example clearly shows that it’s important to understand the difference between each function and to know which to use in a given situation. Ranking by Date SELECT RANK() OVER(ORDER BY exam_date ASC) AS ranking, first_name, last_name, exam_date FROM exam_result; In this case, we’re simply using the exam_date column for ordering. There’s very little difference between ordering by date and by any other column. Take a look at the results: rankingfirst_namelast_nameexam_date 1EmyrDownes2018-12-18 2DinaMorin2019-01-17 3Evie-MayBoyer2019-01-23 4NoraParkinson2019-02-16 5TrystanOconnor2019-02-28 6ErykMyers2019-06-07 7MarleneDuncan2019-06-13 8MariusPowell2019-11-13 9JoanneGoddard2019-12-18 10MariusWilliamson2020-01-02 The important thing to remember is that ASC (ascending) in case of dates means that the oldest will be placed first. In the DESC (descending) order, the newest date will be placed first. Ranking by Month You may also want to order the rows using only a part of the date (e.g., month). It’s not that hard if you know date functions. Take a look at the following query: SELECT RANK() OVER(ORDER BY EXTRACT('year' FROM exam_date) ASC, EXTRACT('month' FROM exam_date) ASC) AS ranking, first_name, last_name, exam_date FROM exam_result; We’re using the EXTRACT() function to extract a certain part of the date. First, we want to order the rows by year and then by month. So, we’re using EXTRACT('year' FROM exam_date) first, then EXTRACT('month' FROM exam_date). Take a look at the results: rankingfirst_namelast_nameexam_date 1EmyrDownes2018-12-18 2Evie-MayBoyer2019-01-23 2DinaMorin2019-01-17 4NoraParkinson2019-02-16 4TrystanOconnor2019-02-28 6MarleneDuncan2019-06-13 6ErykMyers2019-06-07 8MariusPowell2019-11-13 9JoanneGoddard2019-12-18 10MariusWilliamson2020-01-02 EXTRACT() is a pretty simple function. You tell it which date part you want to extract from the date. There are a lot of different date parts you can use. It’s best to consult the documentation of your database management system (DBMS). The most common parts are: 'year' 'month' 'day' 'hour' 'minute' 'second' It’s also worth mentioning that not all the DBMSs support this function. The ones that do include PostgreSQL and MySQL. In SQL Server, this function is called DATEPART(). Date and time operations are a rather difficult topic, which is beyond the scope of this article. If you’d like to know more about the EXTRACT() function and many more, check out the Standard SQL Functions course. It has a whole section dedicated to date and time operations. Ranking With GROUP BY Now, we’ll take a look at how to use rankings with aggregate functions. Even though it sounds scary, with the proper understanding, it's a pretty logical concept. Your database computes the aggregate functions first and then creates a ranking based on the computed values. Take a look at this example with AVG(): SELECT RANK() OVER(ORDER BY AVG(points) DESC) AS ranking, city, AVG(points) AS average_points FROM exam_result GROUP BY city; As you can see, this query isn’t that different from the other queries we’ve seen so far. You can simply use aggregate functions inside ranking functions. The important thing to remember is to use the GROUP BY clause. As mentioned above, the aggregate functions are computed first. This means that with GROUP BY, you can only use aggregate functions or the expressions you’re grouping by inside the ranking function. For example, if you wanted to use another column for ordering so that the rows are ordered by this other column if the average number of points is the same, you’d have to include this other column in the GROUP BY clause. The above query returns the average number of points scored by people from each city. rankingcityaverage_points 1San Francisco80 2San Diego76 3Los Angeles55 Ranking With COUNT() Now, we’ll take a look at an often-used aggregate function—COUNT(): SELECT RANK() OVER(ORDER BY COUNT(*) DESC) AS ranking, city, COUNT(*) AS exam_takers FROM exam_result GROUP BY city; As you can see, this query isn’t much different from the previous query. All of the aggregate functions are used with ranking functions in the same way. The results are as follows: rankingcityexam_takers 1San Diego5 2San Francisco3 3Los Angeles2 The query computes the number of people who took the exam in each city and then creates a ranking based on this value. Ranking With SUM() Another popular aggregate function is SUM(). It’s also fairly simple to use: SELECT RANK() OVER(ORDER BY SUM(points) DESC) AS ranking, city, SUM(points) AS total_points FROM exam_result GROUP BY city; Once again, we’re simply using the SUM() function with the SQL RANK() function. rankingcitytotal_points 1San Diego380 2San Francisco240 3Los Angeles110 This time, we computed the total number of points acquired by people from each city. RANK() OVER(PARTITION BY ...)—Single Column I hope that you’re ready to tackle a more advanced ranking concept: PARTITION BY. It allows you to create rankings in separate groups. Take a look at this query: SELECT RANK() OVER(PARTITION BY city ORDER BY points DESC) AS ranking, city, first_name, last_name, points FROM exam_result; In this query, we’re using PARTITION BY with a single column to create a ranking of people in each city. This way, we can see the people with the highest scores in each city. Take a look at the results: rankingcityfirst_namelast_namepoints 1San FranciscoErykMyers90 2San FranciscoEvie-MayBoyer80 3San FranciscoEmyrDowes70 1Los AngelesDinaMorin70 2Los AngelesTrystanOconnor40 1San DiegoMarleneDuncan90 1San DiegoMariusPowell90 1San DiegoMariusWilliamson90 4San DiegoNoraParkinson60 5San DiegoJoanneGoddard50 As you can see, the rankings are computed separately for each city. This way, Eryk and Dina get the same ranking, even though they got different scores. Also, Dina and Emyr have the same scores, but Dina is ranked higher because she took the exam in a different city. In some cases, PARTITION BY is an important concept, and thus, it’s worth remembering. The good thing is that it’s also pretty simple to use. However, if you want more information, you might want to read Common SQL Window Functions: Using Partitions With Ranking Functions on the LearnSQL.com blog. RANK() OVER(PARTITION BY ...)—Multiple Columns Finally, let’s look at the PARTITION BY clause with multiple columns. It’s not that different from using PARTITION BY with only one column. Take a look: SELECT RANK() OVER(PARTITION BY city, first_name ORDER BY exam_date ASC) AS ranking, city, first_name, last_name, exam_date FROM exam_result; In the above query, we’re using PARTITION BY with two columns: city and first_name. This means that within each distinct pair of city and first name, we will have separate rankings. Take a look at the results: rankingcityfirst_namelast_nameexam_date 1San FranciscoErykMyers2019-06-07 1San FranciscoEvie-MayBoyer2019-01-23 1San FranciscoEmyrDowes2018-12-18 1Los AngelesDinaMorin2019-01-17 1Los AngelesTrystanOconnor2019-02-28 1San DiegoMarleneDuncan2019-06-13 1San DiegoMariusPowell2019-11-13 2San DiegoMariusWilliamson2020-01-02 1San DiegoNoraParkinson2019-02-16 1San DiegoJoanneGoddard2019-12-18 As you can see above, most of the people are ranked first. This is because most of the pairs (city and first name) are unique. However, there’s one pair that isn’t unique. There are two people from San Diego called Marius. Marius Powell is first because he took the exam earlier than Marius Williamson. Remember, Practice Makes Perfect As you can see, there are numerous use cases for ranking functions in SQL. Therefore, it’s important to know them well—you’ll probably have to write an SQL ranking query sooner or later. The best way to learn about ranking functions (and window functions in general) is through practice. I recommend this Window Functions course. It has 218 interactive exercises, which equals about 20 hours of coding. That's quite a lot, especially if you decide to go about it in one go. We do not recommend this. It is better to spread your studies over several days. Here you will find more tips on how to stay healthy when learning SQL. Take care of your body while developing your career, and start learning SQL today. Tags: window functions