28th Sep 2023 10 minutes read How to Use ROW_NUMBER OVER() in SQL to Rank Data Ignacio L. Bisso window functions Table of Contents What Is the ROW_NUMBER() Function? Using ROW_NUMBER() with OVER(): An Introductory Example Creating Rankings with ROW_NUMBER() and ORDER BY Using ORDER BY Twice in One Query Using ROW_NUMBER() with PARTITION BY and ORDER BY Using ROW_NUMBER() in the WHERE Clause The Oracle ROWNUM Pseudocolumn Ready to Practice ROW_NUMBER() and OVER() in SQL? Sometimes you need to know the position of rows in a result set. Learn how using ROW_NUMBER and OVER in SQL can make it happen! Have you ever needed to add a sequential number to the records returned by an SQL query? Or perhaps you need to create a ‘top n’ report based on a specific ranking. In any of these cases, you need to calculate the position of the row in the ranking. To do this, you need the ROW_NUMBER() function. The function assigns a sequential integer number to any row in the result set. In this article, we will explore how to use the ROW_NUMBER() function in SQL. What Is the ROW_NUMBER() Function? ROW_NUMBER is a window function in SQL. It’s used to add sequential numbers to the rows of a result set. Like any other window function, you need to use it with the OVER() clause. Here’s the syntax: SELECT ROW_NUMBER() OVER (...) as athlete_num … FROM athletes; The OVER() clause has two optional subclauses: PARTITION BY and ORDER BY. We will show examples using several different OVER clauses. Before we start, let’s have a few words about window functions in general. Window functions are a very powerful part of SQL, but they are not widely known to the average SQL user. This is why I recommend our interactive course on window functions. In this step-by-step course, we'll walk you through window functions using 200+ practical exercises. By the end of the course, you’ll feel comfortable using window functions on SQL databases. Using ROW_NUMBER() with OVER(): An Introductory Example Let’s show a simple SQL query using the ROW_NUMBER window function. There’s nothing better than sports to illustrate rankings, so let's suppose that we work for a company that organizes sports competitions in many countries. First, we want to assign a sequential number to each athlete; this number will be used as the athlete’s ID in our company. To avoid conflicts, we do not want there to be any criteria to determine the order of the sequential numbering. We want sequential numbers to be assigned to each athlete randomly, not alphabetically by name, country, or sport. We have a table called athlete with the columns firstname, lastname, sport, and country. The query to generate a report including a sequential number for each athlete is: SELECT ROW_NUMBER() OVER () as athlete_id, firstname lastname, sport, country FROM athletes; The expression ROW_NUMBER() OVER () assigns a sequential integer value starting with 1 to each row in the result set of the query. The order of the numbers assigned to rows in the result is not deterministic if you use the simple OVER() clause. (Note that there are no additional clauses like ORDER BY clause or PARTITION BY in OVER()) The first record can be any record of the table; for this record, ROW_NUMBER will return 1. Then the same for the second record which will be number 2, and so on. Below is a partial result of the query: athlete_idfirstnamelastnamesportcountry 1JohnDoeMarathonUSA 2PaulSmithMarathonCanada 3LeaMcCianLong JumpIreland 4AnthonySmithMarathonCanada 5MarieDareauxLong JumpFrance Before ending this section, I would like to suggest the article What Is The OVER Clause In SQL, where you can find several examples of window functions using different combinations of the OVER clause. Creating Rankings with ROW_NUMBER() and ORDER BY Let’s suppose now that the company needs to create a label with the participant number for all the athletes participating in a marathon. Athletes should be ordered by last name, and the company wants to assign a sequential number to each athlete; athletes will wear these numbers as labels on their shirts during the marathon. The labels must start at 1001. The query is: SELECT ROW_NUMBER() OVER (ORDER BY lastname) + 1000 as participant_label, firstname, lastname, country FROM athletes WHERE sport = 'Marathon'; This query is similar to the previous example. One difference is the WHERE clause, which returns only the athletes participating in the marathon. The other difference (which is the main one) is the clause OVER(ORDER BY lastname). This indicates to ROW_NUMBER() that the sequential number must be assigned in order of the lastname— e.g., 1 to the first lastname, 2 to the second, and so on. participant_labelfirstnamelastnamecountry 1001JohnBarryIreland 1002JohnDoeUSA 1003PaulSmithCanada 1004AnthonySmithCanada In the previous result set, participants were ordered by lastname. However, if two participants have the same last name (i.e., Smith), then the order of these two rows is non-deterministic; the rows can be in any order. If we want to order by both lastname and firstname, we should use the expression: ROW_NUMBER() OVER (ORDER BY lastname, firstname) Using ORDER BY Twice in One Query In the above query, we use the ORDER BY clause in the ROW_NUMBER() function. However, the result of the query does not follow any order—that is, the rows are ordered randomly. If we wanted, we could add a second ORDER BY clause at the end of the query to define the order in which the result records are displayed. Let’s modify the previous query by adding a single change: We will put an ORDER BY country: SELECT ROW_NUMBER() OVER (ORDER BY lastname ASC) + 1000 as participant_label, firstname, lastname, country FROM athletes WHERE sport = 'Marathon' ORDER BY country; The rows in the result below are the same rows as in the previous query, but they are shown in different order. Now they are ordered based on the athlete’s country. However, if two or more athletes are from the same country, they are shown in any order. We can see this below in the two athletes from Canada: participant_labelfirstnamelastnamecountry 1002PaulSmithCanada 1003AnthonySmithCanada 1001JohnBarryIreland 1001JohnDoeUSA In this query, we used the ORDER BY clause twice. The first time was used in the ROW_NUMBER function to assign the sequential number following the lastname order. The second time was used to define the order in which the result rows are shown, which is based on the country name. Using ROW_NUMBER() with PARTITION BY and ORDER BY In the next example query, we will use ROW_NUMBER() combined with the PARTITION BY and ORDER BY clauses. We will show a query to assign room numbers to the athletes. Let’s suppose the company wants to accommodate athletes from the same country in contiguous hotel rooms. The idea is to create a label with the country and a sequential number for each athlete and put this label on the door of each hotel room. For example, If the country is Canada and has 3 athletes we want the room labels ‘Canada_1’, ‘Canada_2’, and ‘Canada_3’. The query to generate the room labels with the name of the athlete assigned to this room is: SELECT country || '_' || ROW_NUMBER() OVER (PARTITION BY country ORDER BY lastname ASC) as room_label, firstname, lastname, country FROM athletes; The new element introduced in the query is OVER(PARTITION BY country). It groups the rows from the same country and generates a different sequential series of numbers (starting from 1) for each country. In the following query result, you can see the rows grouped by the PARTITION BY clause have the same color. One group of rows is for Canada (light blue), another for France (purple), and so on. Inside each group of rows, the ORDER BY lastname clause is used to assign sequential numbers to athletes by last name. For ‘Ireland’, we have three rows; the first one is for ‘Barry’, the second one is for ‘Fox’, and so on. room_labelfirst_namelast_namecountry Canada_1AnthonySmithCanada Canada_2PaulSmithCanada France_1MarieDareauxFrance Ireland_1JohnBarryIreland Ireland_2SeanFoxIreland Ireland_3LeaMcCianIreland USA_1JohnDoeUSA I recommend the article How to Use SQL PARTITION BY with OVER, where you can find more examples of the OVER and PARTITION BY clauses. Other Ranking Window Functions: RANK and DENSE_RANK Apart from ROW_NUMBER, SQL provides two other window functions to calculate rankings: RANK and DENSE_RANK. The RANK function works differently than ROW_NUMBER when there are ties between rows. When there is a tie, RANK assigns the same value to both rows and skips the next rank (e.g., 1, 2, 2, 2, 5 – ranks 3 and 4 are omitted). The DENSE_RANK function does not skip the next rank(s). Let's look at a simple example to see the differences between these three functions: SELECT lastname AS athlete_name, time, ROW_NUMBER() OVER (ORDER BY time) AS position_using_row_number, RANK OVER() (ORDER BY time) AS position_using_rank, DENSE_RANK() OVER (ORDER BY time) AS position_using_dense_rank FROM competition_results WHERE sport = ‘Marathon men’; The results are: athlete_nametimeposition_using_row_numberposition_using_rankposition_using_dense_rank Paul Smith1h 58m 02.56s111 John Doe1h 59m 23.55s222 Anthony Smith1h 59m 23.55s322 Carlos Perez2h 1m 11.22s443 If you are interested in the window functions RANK and DENSE_RANK, I suggest these articles for more details and examples: Overview of Ranking Functions in SQL How to Use RANK Functions What Is The RANK Function in SQL and How Do You Use it Using ROW_NUMBER() in the WHERE Clause In SQL, you can’t use window functions in the WHERE clause. However, in some scenarios, you may need to. In a Top 10 report, for instance, it would be very useful to be able to use a condition like WHERE ROW_NUMBER OVER() <= 10. Although you cannot use ROW_NUMBER() directly in the WHERE, you can do it indirectly through a common table expression, or CTE. For example, suppose we want to obtain the first 3 positions in the marathon and the 100-meter race. First, we write the CTE, which begins with WITH: -- CTE starts WITH positions AS ( SELECT lastname AS athlete_name, sport, country, time, ROW_NUMBER OVER (PARTITION BY sport ORDER BY time) AS position FROM competition_results WHERE sport IN (‘Marathon men’, ‘Marathon women’) ) --CTE ends --main query starts SELECT sport, athlete_name, time, country, position FROM positions WHERE position <= 3 ORDER BY sport, position; In the previous query, we created a CTE called positions. It has a column called position that is populated with the result of the ROW_NUMBER() function. In the main query (i.e., the second SELECT statement), we can use the column position in the WHERE clause to filter those athletes who finish the competition in the first three positions. Note: If we have ties between two competitors, the RANK() function could be more appropriate to use than the ROW_NUMBER() function in this report. The results of the query are shown below: sportathlete_nametimecountryposition Marathon menPaul Smith1h 58m 02.56sCanada1 Marathon menJohn Doe1h 59m 23.55sUSA2 Marathon menAnthony Smith1h 59m 23.55sCanada3 Marathon womenMarie Dareaux2h 14m 11.22sFrance1 Marathon womenZui Ru2h 16m 36.63sKenia2 Marathon womenLea Vier2h 17m 55.87sPeru3 If you want to practice SQL window functions, I recommend our interactive Window Functions Practice Set. It provides 100 hands-on exercises on window functions, including creating rankings using different ranking window functions. The Oracle ROWNUM Pseudocolumn Oracle SQL allows us to put a pseudocolumn called ROWNUM in any query. A pseudocolumn behaves like a table column, but is not actually stored in the table. You can select from a pseudocolumn as if it were a column in the table. The ROWNUM pseudocolumn returns the position of the row in the result set. It starts with 1 for the first row and each of the following records is incremented by 1. However, Oracle ROWNUM does not have the power of the ROW_NUMBER window function. For example, you cannot use the PARTITION BY subclause to create several different sequences as we did in the hotel room query. Another limitation is that you can’t use the ORDER BY clause to specify a different order to the sequence than the order of the result set. The reason for these limitations is simple: ROWNUM is not a window function; it is just a simple pseudocolumn. Ready to Practice ROW_NUMBER() and OVER() in SQL? We’ve covered several ways to add a numeric sequence to the result of a query by using the ROW_NUMBER function. And we’ve shown different ways to use the OVER() clause. We also introduced two more SQL ranking functions: RANK and DENSE_RANK. Windows functions are a powerful resource in SQL. If you want to go deeper, I suggest you take our interactive online Window Functions course. It’s a step-by-step tutorial that takes you through SQL window functions using examples and exercises. I also recommend our free SQL Windows Functions Cheat Sheet, which is my preferred cheat sheet. I have it stuck on the wall of my office to use as a quick help for window function syntax. Tags: window functions