Back to articles list Articles Cookbook
2 minutes read

How to Select the First Row in a Group?

Often you want to select a single row from each GROUP BY group. PostgreSQL has a statement especially for that: SELECT DISTINCT ON.

Let's say I want to select one weather report for each location.

location time report
Ottawa 2014-05-15 8:00 sunny
2014-05-15 11:00 cloudy
2014-05-15 15:00 rainy
Warsaw 2014-05-15 8:00 overcast
2014-05-15 11:00 sunny
2014-05-15 15:00 rainy

SELECT DISTINCT ON (location) location, time, report
FROM weather_reports;

The query retrieves one weather report for each location.

You can use several expressions in SELECT DISTINCT ON statement.

SELECT DISTINCT ON (creation_date_week, resource_id) 
  id,
  creation_date, 
  date_trunc('week', creation_date) creation_date_week,
  resource_id
FROM backup_data;

For each resource the query retrieves a single backup data for each week.

SELECT DISTINCT ON with ORDER BY

The undecorated SELECT DISTINCT ON selects one row for each group but you don't know which of the rows will be selected. Each execution of the query can return different rows. Use the ORDER BY clause if you want to select a specific row.

SELECT DISTINCT ON (location) location, time, report
FROM weather_reports
ORDER BY location, time DESC;

The query retrieves the most recent weather report for each location.

SELECT DISTINCT ON (creation_date_week, resource_id)
  id,
  creation_date, 
  date_trunc('week', creation_date) creation_date_week,
  resource_id
FROM backup_data
ORDER BY creation_date_week, resource_id, creation_date DESC;

For each resource the query selects the newest backup data in each week. The SELECT DISTINCT ON expressions have to repeat at the beginning of the ORDER BY clause. You can add additional expression after them.

SELECT DISTINCT

A related clause is the standard SELECT DISTINCT clause. The standard clause eliminates duplicate rows. SELECT DISTINCT ON eliminates rows duplicating the given expressions.