Back to cookbooks list Articles Cookbook

How to Eliminate Duplicate Rows in SQL

  • DISTINCT

Problem:

You’d like to eliminate any duplicate rows from the result set of a query so that each row appears only once.

Example:

Our database has a table named clothes with data in the following columns: id, name, color, and year_produced.

idnamecoloryear_produced
1T-shirtyellow2015
2jacketblue2016
3jeansblack2015
4jacketblue2015
5jacketgreen2016
6jacketyellow2017
7hatyellow2017

Let’s get the non-repeated names and colors of clothes produced before 2017.

Solution:

SELECT DISTINCT name, color
FROM clothes
WHERE year_produced < 2017;

Here’s the result:

namecolor
T-shirtyellow
jacketblue
jeansblack
jacketgreen

Notice that the blue appears only once in this result set, even though it appears twice in the original table. This is because we specified that we want to select only those rows for which the name-color combination is unique.

Discussion:

Simply use the DISTINCT keyword after SELECT if you want to select only non-repeated rows. This keyword forces the query to discard any duplicate rows, based only on the columns you listed.

Here’s an example of selecting only the rows for which the name of the item is unique:

SELECT DISTINCT name
FROM clothes;

Unlike the previous query, this one returns only three records:

name
T-shirt
jacket
jeans

While there are four jeans in the original table (two blue, one green, and one yellow), the item appears only once in this result set thanks to the DISTINCT keyword.

Recommended courses:

Recommended articles:

See also: