15th May 2020 7 minutes read Overview of Ranking Functions in SQL Dorota Wdzięczna window functions Table of Contents ROW_NUMBER() Example 1 Example 2 RANK() Example 1 Example 2 DENSE_RANK() Example 1 Example 2 PERCENT_RANK() Example 1 Example 2 Summary SQL ranking functions make working with relational databases easier, especially for data analysts, marketers, and financial specialists. These functions are used to assign a ranking number for each record and allow you to efficiently create useful reports. SQL ranking functions are window functions. Window functions compute the result based on a set of rows. The word “window” refers to this set of rows. Let’s look at the syntax of ranking functions: rank_function OVER ( [PARTITION BY partition_expression, ... ] ORDER BY order_expression [ASC | DESC] ... ) The syntax starts with the name of the ranking function, like RANK(), DENSE_RANK(), ROW_NUMBER(), or PERCENT_RANK(), and the OVER() clause. In the OVER() clause, you specify the PARTITION BY and ORDER BY clauses. For ranking functions, the ORDER BY clause, including the name(s) of the column(s) or an expression, is mandatory. Before the ORDER BY clause is the optional PARTITION BY clause, which includes the name(s) of the column(s) or an expression. The PARTITION BY clause divides the set of rows into groups of rows for separate rankings. Don’t worry if this syntax seems complicated. I will explain it step-by-step in the following sections. The examples in the following sections will use the table sale, which stores data in the columns salesman_id, sale_date, and sale_amount. See the table sale below: salesman_idsale_datesale_amount 112020-04-2012500.00 122020-04-2012500.00 132020-04-2211000.00 112020-04-2211000.00 122020-04-2222800.00 122020-04-219500.00 112020-04-2131000.00 ROW_NUMBER() The first ranking function I will discuss is ROW_NUMBER(). It returns the sequential number of each record in the result set or within the partition of the result set, starting with 1. Using ROW_NUMBER(), you can select all records and number them. See Example 1 below. Example 1 SELECT ROW_NUMBER() OVER(ORDER BY sale_amount) AS row_number, sale_date, salesman_id, sale_amount FROM sale; This query returns the result: row_numbersale_datesalesman_idsale_amount 12020-04-21129500.00 22020-04-221311000.00 32020-04-221111000.00 42020-04-201112500.00 62020-04-201212500.00 72020-04-221222800.00 82020-04-211131000.00 In this case, the OVER clause contains only the ORDER BY clause with the column sale_amount (this clause sorts the rows according to the amount of the sale ascending from $9,500 to $31,000). The query returns the sequential number starting from 1 in the column row_number. Notice that rows that have the same value in the sale_amount column receive different numbers. So, if you want to rank all records with a unique number, use ROW_NUMBER(). What if you would like to number separate groups of rows? You can use the optional PARTITION BY clause before the ORDER BY clause. See Example 2 below. Example 2 SELECT ROW_NUMBER() OVER(PARTITION BY sale_date ORDER BY sale_amount) AS row_number, sale_date, salesman_id, sale_amount FROM sale; The records are divided into groups (called “partitions”) by the sale date. Within each partition, the records are numbered separately. This query returns the result: row_numbersale_datesalesman_idsale_amount 12020-04-201212500.00 22020-04-201112500.00 12020-04-21129500.00 22020-04-211131000.00 12020-04-221111000.00 22020-04-221311000.00 32020-04-221222800.00 For the sale date 2020-04-20, the rows have numbers 1 and 2, even though the sale amount values are the same. In the next partition, the row with the lower sale amount value has the number 1, and the row with the higher sale amount value has the number 2. The records within each partition are sorted according to the column in ORDER BY. RANK() The second ranking function is RANK(). This function adds a rank number, which is a sequential number, to each row in the result set or within the partition of the result set. The difference between RANK() and ROW_NUMBER() is that RANK() skips duplicate values. When there are duplicate values, the same ranking is assigned, and a gap appears in the sequence for each duplicate ranking. See Example 1 below. Example 1 SELECT RANK() OVER(ORDER BY sale_amount) AS row_number, sale_date, salesman_id, sale_amount FROM sale; This query returns the result: rank_numbersale_datesalesman_idsale_amount 12020-04-21129500.00 22020-04-221311000.00 22020-04-221111000.00 42020-04-201112500.00 42020-04-201212500.00 62020-04-221222800.00 72020-04-211131000.00 In this case, RANK() assigns a rank number for each record like ROW_NUMBER(), but for the same value in sale_amount, the rank number is the same. Salesmen 11 and 13 on 2020-04-22 achieved the same sale amount of $11,000. Therefore, they have the same rank number, 2. In this case, ROW_NUMBER() assigned a different rank number. Notice that the next record doesn’t have the number 3. RANK() skips the rank number(s) of the additional rows with the same value. So, after two rows with rank number 2, the next rank number is 4, not 3. Of course, RANK() also assigns numbers within partitions. See Example 2 below. Example 2 SELECT RANK() OVER(PARTITION BY sale_date ORDER BY sale_amount) AS row_number, sale_date, salesman_id, sale_amount FROM sale; This query returns the result: rank_numbersale_datesalesman_idsale_amount 12020-04-201212500.00 12020-04-201112500.00 12020-04-21129500.00 22020-04-211131000.00 12020-04-221311000.00 12020-04-221111000.00 32020-04-221222800.00 The query above divided the result set into sets of rows with the same sale date. For example, one partition contains the sales on 2020-04-22. The rows are numbered separately for each sale date. On 2020-04-22, salesmen 11 and 13 have the same sale amount of $11000. Therefore, these records both have the rank number 1, and the next record has the rank number 3 because the additional row is skipped. DENSE_RANK() The third ranking function is DENSE_RANK(). If you would like to assign the same number to rows with the same value in a given column but not skip the next numbers, use DENSE_RANK(). DENSE_RANK() is similar to RANK(), but with DENSE_RANK(), the ranking number is not skipped for the same values. See Example 1 below. Example 1 SELECT DENSE_RANK() OVER(ORDER BY sale_amount) AS row_number, sale_date, salesman_id, sale_amount FROM sale; This query returns the result: dense_rank_numbersale_datesalesman_idsale_amount 12020-04-21129500.00 22020-04-221311000.00 22020-04-221111000.00 32020-04-201112500.00 32020-04-201212500.00 42020-04-221222800.00 52020-04-211131000.00 For the same sale amount, the rows have the same number. However, the next rows aren’t skipped, and they have the next sequential number. Notice that for the same sale amount of $11,000 by salesmen 11 and 13 on 2020-04-22, the assigned rank number is 2, but for the next two records with the sale amount of $12,500, the rank number is 3. This function doesn’t skip the next number. RANK() works differently. In this case, for the sale amount of $12,500, RANK() would assign the rank number 4, skipping 3 because two rows had 2. DENSE_RANK() also works with partitions. See Example 2 below. Example 2 SELECT DENSE_RANK() OVER(PARTITION BY sale_date ORDER BY sale_amount) AS row_number, sale_date, salesman_id, sale_amount FROM sale; This query returns the result: dense_rank_numbersale_datesalesman_idsale_amount 12020-04-201212500.00 12020-04-201112500.00 12020-04-21129500.00 22020-04-211131000.00 12020-04-221311000.00 12020-04-221111000.00 22020-04-221222800.00 In this case, on 2020-04-22, the salesmen with the sale amount if $11,000 have the rank number of 1, but the next record has the rank number 2, not 3 like with RANK(). PERCENT_RANK() The last ranking function I will discuss is PERCENT_RANK(). This function returns the percentage ranks. See Example 1 below. Example 1 SELECT PERCENT_RANK() OVER(ORDER BY sale_amount) AS row_number, sale_date, salesman_id, sale_amount FROM sale; This query returns the result: percent_rank_numbersale_datesalesman_idsale_amount 02020-04-21129500.00 0.16666662020-04-221311000.00 0.16666662020-04-221111000.00 0.52020-04-201112500.00 0.52020-04-201212500.00 0.83333342020-04-221222800.00 12020-04-211131000.00 This query calculates the relative rank of each row in the result set. The highest sale amount value is assigned 1 as the percentage rank, and the lowest value is assigned 0. The values between them are returned as the rank from a range of values, which are greater than 0 and less than 1. Halfway between the highest and lowest values, the percent rank number is 0.5. Here, salesmen 11 and 12 on 2020-04-20 are assigned the percent rank number 0.5. The records of salesmen 11 and 13 on 2020-04-22 are between 0 and 0.5, so they have the percent rank number 0.1666666. PERCENT_RANK() works similarly for partitions of records. See Example 2 below. Example 2 SELECT PERCENT_RANK() OVER(PARTITION BY sale_date ORDER BY sale_amount) AS row_number, sale_date, salesman_id, sale_amount FROM sale; This query returns the result: percent_rank_numbersale_datesalesman_idsale_amount 02020-04-201212500.00 02020-04-201112500.00 02020-04-21129500.00 12020-04-211131000.00 02020-04-221311000.00 02020-04-221111000.00 12020-04-221222800.00 In each partition, the highest returned row has percent rank 1, and the lowest has 0. In these partitions, there are no records between the highest and lowest. Therefore, there is no percent rank number between 0 and 1 like in the previous example. Summary In this article, I discussed the SQL ranking functions, explaining their syntax and using them in real-world examples. If you would like to learn more about ranking functions, check out the course “Window Functions” on LearnSQL.com or read the articles “SQL Window Function Example With Explanations,” “How to Use Rank Rows in SQL,” and “Common SQL Window Functions: Using Partitions With Ranking Functions.” Tags: window functions