# How to Number Rows in an SQL Result Set

Have you ever needed to add a sequential number to the records in the result of an SQL query? It’s not as simple as it looks! Find out how to do it correctly in this article.

To number rows in a result set, you have to use an SQL window function called `ROW_NUMBER()`. This function assigns a sequential integer number to each result row. However, it can also be used to number records in different ways, such as by subsets. You can even use it to number records for other interesting purposes, as we will see.

To learn how window functions work, what functions there are, and how to apply them to real-world problems, it’s best to take the Window Functions course. You can find all about it here. It’s interactive, there are 218 exercises, and you only need a web browser and some basic SQL knowledge.

## A Frequent Request: Could You Number the Records?

Suppose you work for a car sales company and you want to produce the following report. Note that the first column (`row_num`), is not a table column; we generate it using `ROW_NUMBER()` in the query.

row_numArticle_codeArticle_nameBranchUnits_sold
1101Katan 2.3 LuxNew York23
2102Katan 1.8 StdNew York17
3102Katan 1.8 StdSan Francisco18
4101Katan 2.3 LuxSan Francisco15
5103Katan GoldNew York3

Result table

The query to obtain the report is:

```SELECT ROW_NUMBER() OVER () AS row_num,
article_code,
article_name,
branch,
units_sold
FROM  Sales
WHERE article_code IN ( 101, 102, 103 )
```

In the above query, the syntax of the `ROW_NUMBER()` function is very simple: we use an empty `OVER` clause. This means that we want to number all the records in the result set using just one sequence of numbers, assigning numbers to records without any order. This is the simplest way to use the `ROW_NUMBER()` function:

`ROW_NUMBER() OVER () AS row_num`

However, there are other ways to use `ROW_NUMBER()`. We can add a `PARTITION BY` and/or an `ORDER BY` clause to the `OVER`, as we’ll see in the next section. The `PARTITION BY` clause allows us to number several groups of records independently, while the `ORDER BY` clause allows us to number the records in specific order. In the next section, we will see some examples.

Before continuing with `ROW_NUMBER()`, we should say a few words about SQL window functions. As we mentioned previously, `ROW_NUMBER()` is a window function. There are many other window functions that you can use in your queries, like `AVG()`, `MAX()`, `LEAD()`, `LAG()` and `FIRST_VALUE()`. If you want to go into details, I suggest LearnSQL’s Window Functions course. It’s a step-by-step tutorial that takes you through SQL window functions using examples and exercises.

## Going Deeper: The Partition By and Order By Clauses

In the previous section, we covered the simplest way to use the `ROW_NUMBER()` window function, i.e. just numbering all records in the result set in no particular order. In the next paragraphs, we will see three examples with some additional clauses, like `PARTITION BY` and `ORDER BY`.

In our first example, we will number the records using a different sequence for each company branch, which will be ordered by the units sold at that branch. In the next query, the `PARTITION BY` branch clause groups the records that have the same value in branch, assigning a different `ROW_NUMBER` sequence to each group/branch. (Each group has a different color in the image below.) The `ORDER BY units_sold` clause defines the order in which we process the rows inside the partition. In this case, the rows belonging to each partition will be ordered by unit_sold in descending order.

```SELECT
ROW_NUMBER() OVER (PARTITION BY branch ORDER BY units_sold DESC) AS row_num,
article_code,
article_name,
branch,
units_sold
FROM  Sales
WHERE article_code IN ( 101, 102, 103 )
```

Note that in this query, sequences are assigned per branch – in the image below, each group of records has a different color – and ordered by `units_sold`. The clause we used is:

`ROW_NUMBER() OVER (PARTITION BY branch ORDER BY units_sold DESC) `

This clause allows us to number groups of records that have the same value in the branch column. In our example, there are two groups of records: New York (red) and San Francisco (blue). Now each group of records will be numbered (the `ORDER BY` clause) based on the `units_sold` column. The values are shown in descending order.

row_numArticle_codeArticle_nameBranchUnits_sold
1101Katan 2.3 LuxNew York23
2102Katan 1.8 StdNew York17
3103Katan GoldNew York3
1102Katan 1.8 StdSan Francisco18
2101Katan 2.3 LuxSan Francisco15

Result table

The `OVER`, `PARTITION BY`, and `ORDER BY` clauses are very common in window functions; if you want to go into details, I suggest you read the article Window Function Examples, where you will find several window functions examples explained in detail.

## Using ROW_NUMBER to Remove Duplicates

Another interesting use case for the `ROW_NUMBER()` function is when we have full duplicated records in a table. Full duplicated records happen when the table has more than one record with the same values in all their columns (usually due to a previous fail). We will show some SQL code to fix this situation; moreover, this code can be adapted to any case of full duplicated records.

First of all, let’s insert some full duplicated records into the `Sales` table. Suppose we don’t have a primary key in the `Sales` table and an SQL developer mistakenly executes the following `INSERT` statement:

```INSERT INTO sales
SELECT * FROM sales WHERE branch = 'San Francisco';
```

After the `INSERT` execution, the table `Sales` looks like this. The last two rows are full duplicates:

Article_codeArticle_nameBranchUnits_soldPeriod
101Katan 2.3 LuxNew York23Q1-2020
102Katan 1.8 StdNew York17Q1-2020
102Katan 1.8 StdSan Francisco18Q1-2020
101Katan 2.3 LuxSan Francisco15Q1-2020
103Katan GoldNew York3Q1-2020
102Katan 1.8 StdSan Francisco18Q1-2020
101Katan 2.3 LuxSan Francisco15Q1-2020

Table: `Sales`

To remove the duplicate records, we will add a new column called `row_num` and we’ll populate it with the following `INSERT` that uses the `ROW_NUMBER()` function. Note that we `PARTITION BY` all columns in the table. Here is the SQL code:

```ALTER TABLE sales ADD COLUMN row_num INTEGER;
INSERT INTO sales
SELECT
article_code, article_name, branch, units_sold, period,
ROW_NUMBER() OVER (PARTITION BY article_code,article_name,branch, units_sold,period)
FROM sales ;
```

Then, after adding the new column and populating it with `ROW_NUMBER()`, our table looks like this:

Article_codeArticle_nameBranchUnits_soldPeriodrow_num
101Katan 2.3 LuxNew York23Q1-2020NULL
102Katan 1.8 StdNew York17Q1-2020NULL
102Katan 1.8 StdSan Francisco18Q1-2020NULL
101Katan 2.3 LuxSan Francisco15Q1-2020NULL
103Katan GoldNew York3Q1-2020NULL
102Katan 1.8 StdSan Francisco18Q1-2020NULL
101Katan 2.3 LuxSan Francisco15Q1-2020NULL
101Katan 2.3 LuxNew York23Q1-20201
102Katan 1.8 StdNew York17Q1-20201
102Katan 1.8 StdSan Francisco18Q1-20201
101Katan 1.8 LuxSan Francisco15Q1-20201
103Katan GoldNew York3Q1-20201
102Katan 1.8 StdSan Francisco18Q1-20202
101Katan 2.3 LuxSan Francisco15Q1-20202

Table: `Sales`

It’s easy to see we need to remove all records with a NULL or a 2 in the column `row_num`. Let’s do it with the `DELETE` command. After that, we need to remove the column `row_num`. Here’s the code:

```DELETE FROM sales WHERE row_rank IS NULL OR row_rank = 2;
ALTER TABLE sales DROP COLUMN row_rank;
```

After running the `DELETE` and `ALTER` statements, the table `Sales` is fixed, with no duplicate records.

There is a disadvantage with this approach that we must clarify. At some point in the process, the number of records in the table `Sales` will be doubled. That can make this method inefficient, especially with large tables. Thus, we recommend using it only on small and medium tables.

## Create a Ranking Report with ROW_NUMBER

In this section, we will use the `ROW_NUMBER()` function to create a ranking. We will see that there are better functions for ranking, like `RANK` and `DENSE_RANK`; however, we can build a pretty good ranking report by using `ROW_NUMBER()`.

Let’s suppose that once a year our car sales company gives three bonuses to their salespeople: one bonus is for the person who sold the most units, another bonus is for the person who made the most revenue, and the third bonus is for the person who made the most profit. If any bonus category is won by two sales reps, then both reps receive 50% of the bonus. We will use the table `Sellers_2019` to obtain the rankings and define the winner of each bonus.

Seller_nameUnits_soldRevenueProfit
John Doyle123834.00038%
Mary Smith121914.00039%
Susan Graue123874.00039%
Simon Doe117824.00042%
Henry Savosky120813.00035%

Table: `Sellers_2019`

The following query will return the rankings we need to define the bonus winners. These rankings will be on the columns units_ranking, revenue_ranking, and profit_ranking.

```SELECT  seller_name,
ROW_NUMBER() OVER (ORDER BY units_sold desc) units_ranking,
ROW_NUMBER() OVER (ORDER BY revenue desc) revenue_ranking,
ROW_NUMBER() OVER (ORDER BY profit desc) profit_ranking
FROM sellers;
```

The next image shows the results of the previous query with the `ROW_NUMBER()` function. Initially, the ranking values seem to be correct. But if we review them in detail, we can find a mistake with Susan Graue in Units_ranking. Susan (and John Doyle) won the units sold bonus with 123 units. However, the `ROW_NUMBER` function assigns position 1 to John and position 2 to Susan, which is not correct.

Seller_nameUnits_rankingRevenue_rankingProfit_ranking
John Doyle134
Mary Smith312
Susan Graue223
Simon Doe541
Henry Savosky455

Result table

So, in this case the `ROW_NUMBER()` function is not the best choice for ranking calculations. Fortunately, SQL provides two functions specifically for ranking purposes: `RANK()` and `DENSE_RANK()`. The next query uses the `RANK()` function instead of the `ROW_NUMBER()`:

```SELECT  seller_name,
RANK() OVER (ORDER BY units_sold desc) units_ranking,
RANK() OVER (ORDER BY revenue desc) revenue_ranking,
RANK() OVER (ORDER BY profit desc) profit_ranking
FROM sellers;
```

In the next image, we can see the results of the `RANK()` query. We can verify that the issue with Susan’s rank is solved.

Seller_nameUnits_rankingRevenue_rankingProfit_ranking
John Doyle134
Mary Smith312
Susan Graue122
Simon Doe541
Henry Savosky455

Result table

If you are interested in the `RANK()` and `DENSE_RANK()` window functions, I suggest you read the article How to use RANK functions. It will give you several examples and queries.