Back to articles list Articles Cookbook
10 minutes read

How to Join the Same Table Twice

JOIN is one of the most common statements in SQL. As you may know, it is used to join and combine data from two or more tables into one common data set. In this article, I’m going to discuss special types of joins? in which you combine the same table twice—including joining a table to itself, also known as the self join. When and why do you need to do this? How do you write it in SQL? Let’s find out.


Joins: A Quick Review

You are probably familiar with the joins in SQL. You have two tables, A and B, and you combine them by using a column common to both. Here is an example:


We have two tables: customer and city, with a common column named city_id.

Now, if you want to join them together to get the customers’ respective city names, you can do so with a join like this:

select customer.customer_id,
     customer.city_id, as city_name
from customer 
join city 
on customer.city_id = city.city_id;

In this JOIN statement, we match the records from customer and city by a key (city_id). We retrieve all 6 columns from the customer table and one column, name, from the city table. There are several types of joins in SQL; this example does an INNER JOIN.

I'm not going to dive deep into the JOIN syntax here. To learn more, check out our interactive SQL JOINs course. It contains 93 exercises designed to refresh SQL JOIN syntax and practice different JOIN types on real world problems.

The result of this join will be one table with all 6 fields from the customer table, plus an additional field from the city table:


Joining the Same Table Multiple Times

Now that we have done a quick review, let's look at more complex joins.

Sometimes you need to join the same table multiple times. Generally, this involves adding one or more columns to a result set from the same table but to different records or by different columns. We will examine two such scenarios: joining a table to itself and joining tables with multiple relationships.

Self Join: Joining a Table to Itself

A self join is a special case of the join. Instead of joining two different tables, you join one table to itself. Why would we want to do this?

In our example above, we wanted to add a column from the city table, the city name, to the customer table. So, we joined two different tables to each other. Doing a self join would mean, for instance, joining the customer table to itself.

Here’s the customer table as a reminder:


The spouse_id column stores the customer_id of the customer’s spouse. For example, Customers 1 and 2 (John and Mary) are spouses of each other, Customers 3 and 5 (Lisa and Tim) are spouses of each other, and so on. We can add the first name and the last name of the spouse to each record in the customer table. To do this, we need to perform a self join, that is, join the customer table to itself:

select cust.customer_id,
      spouse.firstname as spouse_firstname,
      spouse.lastname as spouse_lastname
from customer cust 
join customer spouse
on cust.spouse_id = spouse.customer_id;

When you run this code, the result is the following:


Now that you’ve seen an example use case for self joins, let's review its SQL syntax.

Self Join Syntax

The syntax for the self join is very similar to any other type of joins. Here is the code from our self join example:

select cust.customer_id,
      spouse.firstname as spouse_firstname,
      spouse.lastname as spouse_lastname
from customer cust
join customer spouse
on cust.spouse_id = spouse.customer_id;

It is a JOIN statement in which the customer table is used twice.

The information about the spouse, such as his or her name, is stored in the same table as a separate customer with his or her own customer_id. Since spouse_id contains the customer_id of the spouse, we need to join the table with itself to get the name of the spouse.

You can think of a self join as a join between two copies of the same table. For each record with a non-null value in spouse_id, we search for the value of customer_id that matches it. When we find a match, the columns firstname and lastname are added to the resulting table.

Table aliases are required in a self join. The code does not work without them, since it would not know which copy of the table you are referring to. Here, I’m using the aliases cust and spouse.

Right before the FROM keyword, we choose columns we want to keep in the resulting table. We need to use the table aliases for column retrieval (cust.firstname, cust.lastname, spouse.firstname, etc.). We keep five columns from the customer table and append from the same table two columns that contain the name of the spouse.

This is an inner join, but you can use any type of join: LEFT JOIN, RIGHT JOIN, CROSS JOIN, etc. In our example, we have an inner join which returns only the matched records; that is, only the customers with spouses are returned. Donna doesn't have a spouse in the table, so Donna is not included in the resulting data set.

If you want to learn more about joins, I recommend our article “How to Learn SQL JOINs.” There is also an interactive SQL JOINs course available on our platform.

Examples of the Self Join

The above is just one example. Once you start to use SQL on a daily basis, you come across the need for self joins quite often.

A common use case for the self join is when there is a hierarchy among the records in a table. This type of data structure is called a tree structure, and you often need to join the table with itself in SQL. Here are some examples.

Example 1: Employee Hierarchy

Each employee has a manager, and a manager in turn has his or her manager, all in the same table. If you want to add the corresponding manager information to each record, you need to do a self join. We cover this example in the article “An Illustrated Guide to the SQL Self Join,” so take a look at it to see what this looks like.

Example 2: Department Hierarchy

Each department within an organization has a parent: for example, the Data Science department is under the IT department, the IT department is under Business Support, and Business Support is under the board.

Consider the following table, department, which looks like this:

1Board of directors.
3Control and risk1
5Corporate credit2
6Retail banking2
8Risk management3
10Internal audit3
13General services4
14Human resources4

Now, if you want to append the parent name to each department, you need to write a self join:

select c.*, as parent_name
from department c 
left join department p 
on c.parent_department_id=p.department_id;

Inside this SELECT, we join the department table with itself to get the parent department name as an additional field. Note that the record whose department_id is 1 doesn't have a parent (parent_department_id is NULL; it is not populated). This is because the board of directors are at the top of the tree structure. We want to display this record in the result, so we use a LEFT JOIN and not an INNER JOIN.

When you run this code, the resulting table looks like this:

1Board of directors..
2Operations1Board of directors
3Control and risk1Board of directors
4Administration1Board of directors
5Corporate credit2Operations
6Retail banking2Operations
8Risk management3Control and risk
9Finance3Control and risk
10Internal audit3Control and risk
13General services4Administration
14Human resources4Administration

You can easily see the parent to which each department belongs: IT is under Administration, Administration is under the Board, etc.

Example 3: Category Hierarchy

Take classified advertising—those popular web pages where you can rent, buy, or sell anything from real estate to miscellaneous products and services. To place an ad, you pick a category and a subcategory for your ad. For example, if you sell real estate, you would choose among subcategories like house, apartment, or land.

We have a table named category which contains information about these categories and subcategories as well as their relationships. The relationships among the categories and subcategories in this table are stored in a parent-child structure like this:

1Real estate.
7Personal cars5

All categories and subcategories are in this one table. Now, If you want to add information about the parent to each record, you will need to do a self join—join this table to itself:

select subcategory.*,
    main.category_name as parent_name
from category subcategory 
left join category main 
on subcategory.parent_category_id = main.category_id;

Here is the result from executing this SQL statement:

1Real Estate..
2Apartments1Real Estate
3Houses1Real Estate
4Offices1Real Estate
7Personal cars5Cars

Multiple Relationships Between two Tables

There are situations beside the self join in which you need to join the same table more than once. One is when you have multiple relationships between two different tables. This is where you join the same table twice but usually to some other table and not necessarily to itself.

Suppose that the customer table has two fields that contain city IDs. This is common if you have two different cities for each customer?—?for example, the city of residence (residence_city_id) and the city of the mailing address where the notices should be sent (notice_city_id):


We also have city which has the city ID (city_id) and the name of the city (name), as seen earlier and shown below as a reminder:


Now, if you want to display the names of the cities, you will have to join the city table twice:

select cust.customer_id,
      cust.notice_city_id, as residence_city_name, as notice_city_name
from customer cust 
join city residence_city 
on cust.residence_city_id=residence_city.city_id
join city notice_city 
on cust.notice_city_id=notice_city.city_id;

Let’s break down what is happening in this code. First, we join customer and city with residence_city_id as the key. We get residence_city_name by matching it to city_id in the city table. A second join is performed between customer and city to get notice_city_name. The key used here is notice_city_id which also matches to city_id in the city table.

We use table aliases cust for customer, residence_city for the first copy of city to get the residence city name, and notice_city for the second copy of city to get the notice city name. We use the aliases to define the columns in the resulting table. Aliases are also used during the join to define the key columns. Again, aliases are required in order to distinguish the two copies of city.

When you run this code, you get the following result:


We now have two additional columns with the corresponding city names.

We use simple (i.e., inner) joins here, but you can use any type of join as needed. If you are new to SQL joins and want to read more about their different types, I recommend the articles “How to learn joins” and “How to practice joins” which cover these topics. If you prefer to learn by watching videos. I highly recommend the episode that discusses joins.

Joining the Same Table Twice

In this article, we discussed when you need to join the same table twice in SQL and saw some common business use cases. We explained how to do so and what the SQL syntax looks like. Self joins with hierarchical data and multiple relationships between two tables are just two of the situations for which you need to join the same table twice. There are others; generally, they involve adding one or more columns to a result set from the same table in the same column.

If you want to learn more about joins, including self joins, I recommend our SQL JOINs interactive course available on our platform. When it comes to SQL, it is important to practice; our course is designed for just that!