Back to articles list May 21, 2020 - 8 minutes read How to Number Rows in an SQL Result Set Ignacio L. Bisso Ignacio is a database consultant from Buenos Aires, Argentina. He’s worked for 15 years as a database consultant for IT companies like Informix and IBM. These days, he teaches databases at Sarmiento University and works as a PostgreSQL independent SQL consultant. A proud father of four kids with 54 years in his backpack, Ignacio plays soccer every Saturday afternoon, enjoying every match as if it’s his last one. Tags: sql learn sql window functions Have you ever needed to add a sequential number to the records in the result of an SQL query? It’s not as simple as it looks! Find out how to do it correctly in this article. To number rows in a result set, you have to use an SQL window function called ROW_NUMBER(). This function assigns a sequential integer number to each result row. However, it can also be used to number records in different ways, such as by subsets. You can even use it to number records for other interesting purposes, as we will see. A Frequent Request: Could You Number the Records? Suppose you work for a car sales company and you want to produce the following report. Note that the first column (row_num), is not a table column; we generate it using ROW_NUMBER() in the query. row_numArticle_codeArticle_nameBranchUnits_sold 1101Katan 2.3 LuxNew York23 2102Katan 1.8 StdNew York17 3102Katan 1.8 StdSan Francisco18 4101Katan 2.3 LuxSan Francisco15 5103Katan GoldNew York3 Result table The query to obtain the report is: SELECT ROW_NUMBER() OVER () AS row_num, article_code, article_name, branch, units_sold FROM Sales WHERE article_code IN ( 101, 102, 103 ) In the above query, the syntax of the ROW_NUMBER() function is very simple: we use an empty OVER clause. This means that we want to number all the records in the result set using just one sequence of numbers, assigning numbers to records without any order. This is the simplest way to use the ROW_NUMBER() function: ROW_NUMBER() OVER () AS row_num However, there are other ways to use ROW_NUMBER(). We can add a PARTITION BY and/or an ORDER BY clause to the OVER, as we’ll see in the next section. The PARTITION BY clause allows us to number several groups of records independently, while the ORDER BY clause allows us to number the records in specific order. In the next section, we will see some examples. Before continuing with ROW_NUMBER(), we should say a few words about SQL window functions. As we mentioned previously, ROW_NUMBER() is a window function. There are many other window functions that you can use in your queries, like AVG(), MAX(), LEAD(), LAG() and FIRST_VALUE(). If you want to go into details, I suggest LearnSQL’s Window Functions course. It’s a step-by-step tutorial that takes you through SQL window functions using examples and exercises. Going Deeper: The Partition By and Order By Clauses In the previous section, we covered the simplest way to use the ROW_NUMBER() window function, i.e. just numbering all records in the result set in no particular order. In the next paragraphs, we will see three examples with some additional clauses, like PARTITION BY and ORDER BY. In our first example, we will number the records using a different sequence for each company branch, which will be ordered by the units sold at that branch. In the next query, the PARTITION BY branch clause groups the records that have the same value in branch, assigning a different ROW_NUMBER sequence to each group/branch. (Each group has a different color in the image below.) The ORDER BY units_sold clause defines the order in which we process the rows inside the partition. In this case, the rows belonging to each partition will be ordered by unit_sold in descending order. SELECT ROW_NUMBER() OVER (PARTITION BY branch ORDER BY units_sold DESC) AS row_num, article_code, article_name, branch, units_sold FROM Sales WHERE article_code IN ( 101, 102, 103 ) Note that in this query, sequences are assigned per branch – in the image below, each group of records has a different color – and ordered by units_sold. The clause we used is: ROW_NUMBER() OVER (PARTITION BY branch ORDER BY units_sold DESC) This clause allows us to number groups of records that have the same value in the branch column. In our example, there are two groups of records: New York (red) and San Francisco (blue). Now each group of records will be numbered (the ORDER BY clause) based on the units_sold column. The values are shown in descending order. row_numArticle_codeArticle_nameBranchUnits_sold 1101Katan 2.3 LuxNew York23 2102Katan 1.8 StdNew York17 3103Katan GoldNew York3 1102Katan 1.8 StdSan Francisco18 2101Katan 2.3 LuxSan Francisco15 Result table The OVER, PARTITION BY, and ORDER BY clauses are very common in window functions; if you want to go into details, I suggest you read the article Window Function Examples, where you will find several window functions examples explained in detail. Using ROW_NUMBER to Remove Duplicates Another interesting use case for the ROW_NUMBER() function is when we have full duplicated records in a table. Full duplicated records happen when the table has more than one record with the same values in all their columns (usually due to a previous fail). We will show some SQL code to fix this situation; moreover, this code can be adapted to any case of full duplicated records. First of all, let’s insert some full duplicated records into the Sales table. Suppose we don’t have a primary key in the Sales table and an SQL developer mistakenly executes the following INSERT statement: INSERT INTO sales SELECT * FROM sales WHERE branch = 'San Francisco'; After the INSERT execution, the table Sales looks like this. The last two rows are full duplicates: Article_codeArticle_nameBranchUnits_soldPeriod 101Katan 2.3 LuxNew York23Q1-2020 102Katan 1.8 StdNew York17Q1-2020 102Katan 1.8 StdSan Francisco18Q1-2020 101Katan 2.3 LuxSan Francisco15Q1-2020 103Katan GoldNew York3Q1-2020 102Katan 1.8 StdSan Francisco18Q1-2020 101Katan 2.3 LuxSan Francisco15Q1-2020 Table: Sales To remove the duplicate records, we will add a new column called row_num and we’ll populate it with the following INSERT that uses the ROW_NUMBER() function. Note that we PARTITION BY all columns in the table. Here is the SQL code: ALTER TABLE sales ADD COLUMN row_num INTEGER; INSERT INTO sales SELECT article_code, article_name, branch, units_sold, period, ROW_NUMBER() OVER (PARTITION BY article_code,article_name,branch, units_sold,period) FROM sales ; Then, after adding the new column and populating it with ROW_NUMBER(), our table looks like this: Article_codeArticle_nameBranchUnits_soldPeriodrow_num 101Katan 2.3 LuxNew York23Q1-2020NULL 102Katan 1.8 StdNew York17Q1-2020NULL 102Katan 1.8 StdSan Francisco18Q1-2020NULL 101Katan 2.3 LuxSan Francisco15Q1-2020NULL 103Katan GoldNew York3Q1-2020NULL 102Katan 1.8 StdSan Francisco18Q1-2020NULL 101Katan 2.3 LuxSan Francisco15Q1-2020NULL 101Katan 2.3 LuxNew York23Q1-20201 102Katan 1.8 StdNew York17Q1-20201 102Katan 1.8 StdSan Francisco18Q1-20201 101Katan 1.8 LuxSan Francisco15Q1-20201 103Katan GoldNew York3Q1-20201 102Katan 1.8 StdSan Francisco18Q1-20202 101Katan 2.3 LuxSan Francisco15Q1-20202 Table: Sales It’s easy to see we need to remove all records with a NULL or a 2 in the column row_num. Let’s do it with the DELETE command. After that, we need to remove the column row_num. Here’s the code: DELETE FROM sales WHERE row_rank IS NULL OR row_rank = 2; ALTER TABLE sales DROP COLUMN row_rank; After running the DELETE and ALTER statements, the table Sales is fixed, with no duplicate records. There is a disadvantage with this approach that we must clarify. At some point in the process, the number of records in the table Sales will be doubled. That can make this method inefficient, especially with large tables. Thus, we recommend using it only on small and medium tables. Create a Ranking Report with ROW_NUMBER In this section, we will use the ROW_NUMBER() function to create a ranking. We will see that there are better functions for ranking, like RANK and DENSE_RANK; however, we can build a pretty good ranking report by using ROW_NUMBER(). Let’s suppose that once a year our car sales company gives three bonuses to their salespeople: one bonus is for the person who sold the most units, another bonus is for the person who made the most revenue, and the third bonus is for the person who made the most profit. If any bonus category is won by two sales reps, then both reps receive 50% of the bonus. We will use the table Sellers_2019 to obtain the rankings and define the winner of each bonus. Seller_nameUnits_soldRevenueProfit John Doyle123834.00038% Mary Smith121914.00039% Susan Graue123874.00039% Simon Doe117824.00042% Henry Savosky120813.00035% Table: Sellers_2019 The following query will return the rankings we need to define the bonus winners. These rankings will be on the columns units_ranking, revenue_ranking, and profit_ranking. SELECT seller_name, ROW_NUMBER() OVER (ORDER BY units_sold desc) units_ranking, ROW_NUMBER() OVER (ORDER BY revenue desc) revenue_ranking, ROW_NUMBER() OVER (ORDER BY profit desc) profit_ranking FROM sellers; The next image shows the results of the previous query with the ROW_NUMBER() function. Initially, the ranking values seem to be correct. But if we review them in detail, we can find a mistake with Susan Graue in Units_ranking. Susan (and John Doyle) won the units sold bonus with 123 units. However, the ROW_NUMBER function assigns position 1 to John and position 2 to Susan, which is not correct. Seller_nameUnits_rankingRevenue_rankingProfit_ranking John Doyle134 Mary Smith312 Susan Graue223 Simon Doe541 Henry Savosky455 Result table So, in this case the ROW_NUMBER() function is not the best choice for ranking calculations. Fortunately, SQL provides two functions specifically for ranking purposes: RANK() and DENSE_RANK(). The next query uses the RANK() function instead of the ROW_NUMBER(): SELECT seller_name, RANK() OVER (ORDER BY units_sold desc) units_ranking, RANK() OVER (ORDER BY revenue desc) revenue_ranking, RANK() OVER (ORDER BY profit desc) profit_ranking FROM sellers; In the next image, we can see the results of the RANK() query. We can verify that the issue with Susan’s rank is solved. Seller_nameUnits_rankingRevenue_rankingProfit_ranking John Doyle134 Mary Smith312 Susan Graue122 Simon Doe541 Henry Savosky455 Result table If you are interested in the RANK() and DENSE_RANK() window functions, I suggest you read the article How to use RANK functions. It will give you several examples and queries. Tags: sql learn sql window functions You may also like SQL Window Function Example With Explanations Interested in how SQL window functions work? Scroll down to see our SQL window function example with definitive explanations! Read more When Do I Use SQL Window Functions? SQL window functions can help you quickly and accurately create useful reports and analyses. Learn more with real-world business examples. Read more SQL Window Functions Cheat Sheet This 2-page SQL Window Functions Cheat Sheet covers the syntax of window functions and a list of window functions. Download it in PDF or PNG format. Read more Overview of Ranking Functions in SQL Do you need to rank rows in SQL? Learn about the SQL ranking functions, their syntax, and the differences between them, and see real-world examples. Read more Why Should I Learn SQL Window Functions? Want to learn what SQL window functions are, when you can use them, and why they are useful? This article is intended just for you. Read more SQL Course of the Month – Window Functions Find out why you should learn SQL window functions in April and why you should do it in our course. Read more What Is a SQL Running Total and How Do You Compute It? In SQL, the running total is a very common pattern. It’s frequently used in finance and data analysis. Find out what a SQL running total is and how to compute this cumulative sum with window functions. Read more What Is the Difference Between a GROUP BY and a PARTITION BY? What is the difference between a GROUP BY and a PARTITION BY in SQL queries? When should you use which? You can find the answers in today's article. Read more How to Analyze Time Series COVID-19 Data with SQL Window Functions Discover how to analyze COVID-19 time series data with the help of SQL window functions Read more SQL Window Functions vs. GROUP BY: What’s the Difference? Window functions and GROUP BY may seem similar at first, but they’re quite different. Learn how window functions differ from GROUP BY and aggregate functions. Read more Why Window Functions Are Not Allowed in WHERE Clauses Window functions can only appear in SELECT and ORDER BY but not in WHERE clauses. The reason is the logical order in which SQL queries are processed. Read more How to Rank Rows in SQL: A Complete Guide Here’s what you need to know about SQL RANK all in one place. Learn about RANK functions with explanations, examples, and common use cases. Read more SQL Window Functions By Explanation Window functions in SQL operate on a set of table rows and return a single aggregated value for each of the rows. Read more Common SQL Window Functions: Positional Functions Positional SQL window functions deal with data's location in the set. In this post, we explain LEAD, LAG, and other positional functions. Read more Common SQL Window Functions: Using Partitions With Ranking Functions Once you’ve learned such window functions as RANK or NTILE, it’s time to master using SQL partitions with ranking functions. Read more How to Use Rank Functions in SQL In this article, you’ll learn how to use rank functions in SQL. It’ll give you a solid foundation for getting deeper into SQL window functions. Read more Subscribe to our newsletter Join our weekly newsletter to be notified about the latest posts.