# How to Select the First Row in Each GROUP BY Group

## Problem:

You've grouped your data with `GROUP BY` and would like to display only the first row from each group.

## Example:

Our database has a table named `exam_results` with data in the following table:

first_namelast_nameyearresult
JohnKlein202040
EdithBlack202043
MarkJohnson201932
LauraSummer202035
KateSmith201941
JacobBlack201944
TomBennett202038
EmilyKelly202043

For each year, let's find the student with the best `result`. If there are two students tied for the best in a group, we'll arbitrarily select one of them to display.

## Solution:

```WITH added_row_number AS (
SELECT
*,
ROW_NUMBER() OVER(PARTITION BY year ORDER BY result DESC) AS row_number
FROM exam_results
)
SELECT
*
WHERE row_number = 1;
```

The result is:

first_namelast_nameyearresultrow_number
JacobBlack2019441
EmilyKelly2020431

## Discussion:

First, you need to write a CTE in which you assign a number to each row within each group. To do that, you can use the `ROW_NUMBER()` function. In `OVER()`, you specify the groups into which the rows should be divided (`PARTITION BY`) and the order in which the numbers should be assigned to the rows (`ORDER BY`).

Take a look at the result of the inner query:

```SELECT
*,
ROW_NUMBER() OVER(PARTITION BY year ORDER BY result DESC) AS row_number
FROM exam_results;
```
first_namelast_nameyearresultrow_number
JacobBlack2019441
KateSmith2019412
MarkJohnson2019323
EmilyKelly2020431
EdithBlack2020432
JohnKlein2020403
TomBennett2020384
LauraSummer2020355

You assign the row numbers within each group (i.e., year). Each row has a row number based on the value of the `result` column. The rows are sorted in the descending order because of the `DESC` keyword after `ORDER BY result`. Even if there are multiple rows within a group that have the same value of `result`, the rows are still given different numbers. Here, Edith Black and Emily Kelly have the same `result` but different row numbers. To change this behavior and assign the same row number for the same result within a group, use `RANK()` or `DENSE_RANK()` instead of `ROW_NUMBER()`.

In the outer query, you select all data from the CTE (`added_row_number`) and use a `WHERE` condition to specify which row to display from each group. Here, we want to display the first row, so the condition is `row_number = 1`.

Note that you can easily modify the solution to get, for example, the second row of each group.

```WITH added_row_number AS (
SELECT
*,
ROW_NUMBER() OVER(PARTITION BY year ORDER BY result DESC) AS row_number
FROM exam_results
)
SELECT
*
WHERE row_number = 2;
```

Here's the result:

first_namelast_nameyearresultrow_number
KateSmith2019412
EdithBlack2020432

On the other hand, if you want to get the row(s) with the second highest value of `result` within each group, you should use the `DENSE_RANK()` function. While the `ROW_NUMBER()` function creates consecutive numbers for each row in a group, resulting in different values assigned to the rows with the same result, the `DENSE_RANK()` function gives the same number to the rows with the same result.

```WITH added_dense_rank AS (
SELECT
*,
DENSE_RANK() OVER(PARTITION BY year ORDER BY result DESC) AS rank
FROM exam_results
)
SELECT
*
You can see that John Klein has the second highest value of `result (40)` for the year 2020. John Klein is actually the third person in the group, but the first two students have the same `result` and they both have `rank = 1`.