Back to articles list Articles Cookbook
9 minutes read

What Is a Recursive CTE in SQL?

The article that’ll show you practical examples of using recursive CTEs in SQL.

If you’ve heard about SQL’s recursive CTEs but never used them, this article is for you. It’s also for you if you never get tired of recursive CTE examples.

Before we dig into recursion, I’ll remind you what CTEs are and what their syntax is. Then I’ll do the same for recursive CTEs. After that, I’ll show you how recursive CTEs work in three examples.

What are CTEs?

The CTE (common table expression), also known as the WITH clause, is an SQL feature that returns a temporary data set that can be used by another query. As it’s a temporary result, it’s not stored anywhere, but it still can be referenced like you would reference any other table.

There are two types of CTEs, non-recursive and recursive.

Here’s a nice article that will show you what CTEs are and how they work.

Non-Recursive CTE Syntax

The general syntax of a non-recursive CTE looks like this:

WITH cte_name AS (cte_query_definition)

SELECT *
FROM   cte_name;

The first part of the syntax is the CTE. It begins with the keyword WITH. Then you give your CTE a name. After you follow that by the AS keyword, you can define CTE in the parentheses.

The second part of the syntax is a simple SELECT statement. It is written immediately after the recursive CTE, without any commas, semicolons, or similar marks. Like I said earlier, the CTE is used in another query just like any other table. This is exactly what the SELECT statement does.

Here’s the article that can additionally help you with the CTE syntax and its rules. And if you need some more CTE examples, this article is for you.

Recursive CTE Syntax

A recursive CTE references itself. It returns the result subset, then it repeatedly (recursively) references itself, and stops when it returns all the results.

The syntax for a recursive CTE is not too different from that of a non-recursive CTE:

WITH RECURSIVE cte_name AS (
    cte_query_definition (the anchor member)

    UNION ALL

    cte_query_definition (the recursive member)
)


SELECT *
FROM   cte_name;

Again, at the beginning of your CTE is the WITH clause. However, if you want your CTE to be recursive, then after WITH you write the RECURSIVE keyword. Then it’s business as usual: AS is followed by the parentheses with the CTE query definition. This first query definition is called the anchor member.

To connect the anchor member with the recursive member, you need to use the UNION or UNION ALL command. The recursive member is, obviously, the recursive part of CTE that will reference the CTE itself. You’ll see how it works in an example very soon. 

Recursive CTEs are used primarily when you want to query hierarchical data or graphs. This could be a company’s organizational structure, a family tree, a restaurant menu, or various routes between cities. See these articles to understand how CTEs work with hierarchical structures and how to query graph data.

Now that we understand how recursive CTEs work, let’s look at some examples.

Example 1 – Finding Bosses and Hierarchical Level for All Employees

For this problem, I’ll use data from the table employees, which has the following columns:

  • id: The employee’s ID.
  • first_name: The employee’s first name.
  • last_name: The employee’s last name.
  • boss_id: The employee boss’s ID.

Here’s what the data looks like:

idfirst_namelast_nameboss_id
1DomenicLeaver5
2ClevelandHewins1
3KakalinaAtherton8
4RoxannaFairlieNULL
5HermieComsty4
6PoohGoss8
7FaulknerChalliss5
8BobbeBlakeway4
9LaureneBurchill1
10AugustaGosdin8

It’s not too complicated. For example, Domenic Leaver’s boss is the employee with the ID of 5; that’s Hermie Comsty. The same principle works for all other employees except Roxanna Fairlie. She has no boss; there is a NULL value in the column boss_id. We can conclude that Roxanna is the president or owner of the company.

Let’s now write the recursive CTE to list all employees and their direct bosses.

WITH RECURSIVE company_hierarchy AS (
  SELECT	id,
    		first_name,
    		last_name,
    		boss_id,
		0 AS hierarchy_level
  FROM employees
  WHERE boss_id IS NULL

  UNION ALL 
  
  SELECT	e.id,
    		e.first_name,
    		e.last_name,
    		e.boss_id, 
		hierarchy_level + 1
  FROM employees e, company_hierarchy ch
  WHERE e.boss_id = ch.id
)

SELECT   ch.first_name AS employee_first_name,
	   ch.last_name AS employee_last_name,
	   e.first_name AS boss_first_name,
	   e.last_name AS boss_last_name,
	   hierarchy_level
FROM company_hierarchy ch
LEFT JOIN employees e
ON ch.boss_id = e.id
ORDER BY ch.hierarchy_level, ch.boss_id;

What does this query do? It’s a recursive query, so it starts with WITH RECURSIVE. The name of the CTE is company_hierarchy. After AS, the CTE definition is in the parentheses.

The first SELECT statement selects all the employee table columns where the column boss_id is NULL. In short, it will select Roxanna Fairlie, because only she has a NULL value in that column. Even shorter: I’m starting the recursion from the top of the organizational structure. There’s also a column hierarchy_level with the value of 0. That means the owner/president’s level is 0 – they're on top of the hierarchy.

I’ve used the UNION ALL to connect this SELECT statement with the second one, i.e. with the recursive member. In the recursive member, I’m selecting all the columns from the table employees and the CTE company_hierarchy where the column boss_id is equal to the column id. Notice the part  hierarchy_level + 1. This means that with every recursion, the CTE will add 1 to the previous hierarchy level, and it will do that until it reaches the end of the hierarchy. Also note that I’m treating this CTE as any other table. To finish defining the CTE, simply close the parentheses.

Finally, there’s a third SELECT statement, outside of the CTE. It selects the columns that will show employees, their bosses’ names, and the hierarchy level. Data is taken from the CTE and the table employees. I’ve joined those two with a LEFT JOIN, since I want all the data from the CTE – including Roxanna Fairlie, who has the NULL value in the column boss_id.  The result will be shown in ascending order: first by the hierarchy level, then by the boss’s ID. Here’s how it looks:

employee_first_nameemployee_last_nameboss_first_nameboss_last_namehierarchy_level
RoxannaFairlieNULLNULL0
HermieComstyRoxannaFairlie1
BobbeBlakewayRoxannaFairlie1
DomenicLeaverHermieComsty2
FaulknerChallissHermieComsty2
AugustaGosdinBobbeBlakeway2
PoohGossBobbeBlakeway2
KakalinaAthertonBobbeBlakeway2
LaureneBurchillDomenicLeaver3
ClevelandHewinsDomenicLeaver3

Roxanna Fairlie is the ultimate boss; you already knew that. There are two employees at level 1. This means Bobbe Blakeway and Hermie Comsty are the direct subordinates of Roxanna Fairlie. At level 2, there are employees whose direct bosses are Bobbe Blakeway and Hermie Comsty. There’s also a third level in the hierarchy. Those are employees whose immediate boss is Domenic Leaver.

Example 2 – Finding the Investment Amount by Investor

In this example, I’ll use the table investment:

  • id: The investment’s ID.
  • investment_amount: The investment’s amount.

The data in the table looks like this:

idinvestment_amount
19,705,321.00
25,612,948.60
35,322,146.00

These are the amounts of the three possible investment options. They will be considered by the three investors, who will divide the total investment amount in equal shares. Your task is to calculate the amount per investor depending on their number, i.e. if one, two, three, or no investors invest in each investment.

The query that solves this problem is:

WITH RECURSIVE per_investor_amount AS (
	SELECT	0 AS investors_number,
			0.00 AS investment_amount,
			0.00 AS individual_amount
	UNION 

	SELECT	investors_number + 1,
			i.investment_amount,
			i.investment_amount / (investors_number + 1)
	FROM investment i, per_investor_amount pia
	WHERE investors_number << 3
)

SELECT *
FROM per_investor_amount
ORDER BY  investment_amount, investors_number;

Once again, the CTE starts with WITH RECURSIVE, followed by its name and the query definition. This time, I’ll use the anchor member of the recursive query to create some data. The columns are investors_number, investment_amount, and individual_amount. This is the point I want the recursion to start from (the same way it was in the previous example, with hierarchy_level = 0).

Then comes the UNION and the recursive member. This part of the query will increase the investors_number column by one with every recursion. It will do that for every investment_amount. The third column will calculate the amount of that investment per investor, depending on the number of investors participating. The recursion will be performed for up to three investors (i.e. until it reaches the condition WHERE investors_number < 3).

After that comes the simple SELECT statement that will return all the columns from the CTE. And here’s the result:

investors_numberinvestment_amountindividual_amount
00.000.00
15,322,146.005,322,146.00
25,322,146.002,661,073.00
35,322,146.001,774,048.67
15,612,948.605,612,948.60
25,612,948.602,806,474.30
35,612,948.601,870,982.87
19,705,321.009,705,321.00
29,705,321.004,852,660.50
39,705,321.003,235,107.00

It’s not difficult to analyze. If there are no investors, the investment amount is zero, and so is the individual amount. If the investment is 5,322,146.00 and there’s only one investor, then the amount per investor will be 5,322,146.00. If there are two investors in the same amount, then each of them will have to pay 2,661,073.00. If all three investors decide to invest, each will pay 1,774,048.67. The other two investment amounts follow the same pattern, as you can see in the table.

Example 3 – Finding Routes Between Cities

In the third example, I’ll be using the table cities_route, which contains data about Dutch cities:

  • city_from: The departure city.
  • city_to: The destination city.
  • distance: The distance between two cities, in kilometers.
city_fromcity_todistance
GroningenHeerenveen61.4
GroningenHarlingen91.6
HarlingenWieringerwerf52.3
WieringerwerfHoorn26.5
HoornAmsterdam46.1
AmsterdamHaarlem30
HeerenveenLelystad74
LelystadAmsterdam57.2

Use this table to find all the possible routes from Groningen to Haarlem, showing the cities on the route and the total distance.

Here’s the query to solve this problem:

WITH RECURSIVE possible_route AS (
	SELECT	cr.city_to,
       		cr.city_from || '->' ||cr.city_to AS route,
       		cr.distance
      FROM cities_route cr
      WHERE cr.city_from = 'Groningen'

UNION ALL

SELECT 	cr.city_to,
       		pr.route || '->' || cr.city_to AS route,
        		CAST((pr.distance + cr.distance) AS DECIMAL(10, 2))
      FROM possible_route pr
INNER JOIN cities_route cr
      		ON cr.city_from = pr.city_to
)

SELECT 	pr.route,
		pr.distance
FROM possible_route pr
WHERE pr.city_to = 'Haarlem'
ORDER BY pr.distance;

Let’s see what this query does. The first SELECT statement in the CTE definition will select the columns from the table cities_route where the departure city is Groningen. Notice there’s also a new column called route, which I’ll use to concatenate the cities on the route.

The UNION ALL connects this with the recursive member. This SELECT statement will select the arrival city, concatenate the cities on the route, and finally add the distances between these cities to the total of the route between Groningen and Haarlem. To achieve all that, I’ve joined the CTE with the table cities_route.

Then comes the SELECT statement that pulls data from the CTE. It will select the route and the distance where the arrival city is Haarlem, with the data being ordered by distance in ascending order.

The query result looks like this:

routedistance
Groningen->Heerenveen->Lelystad->Amsterdam->Haarlem222.6
Groningen->Harlingen->Wieringerwerf->Hoorn->Amsterdam->Haarlem246.5

It’s not difficult to understand this table. There are two routes from Groningen to Haarlem. They include different cities in between and they are 222.6 km and 246.5 km long, respectively.

If you want to keep learning, check out how you can use a recursive CTE instead of a long SQL query. And after tackling that subject,have a little fun by drawing something using a recursive CTE.

Continue Practicing Recursive CTEs

These three examples have demonstrated the possibilities of recursive CTEs in SQL. Now it’s time to build on what you learned here.

Probably the best option is having a go at our Recursive Queries course. It offers you plenty of examples, explanations, and practice opportunities. The course is part of the Advanced SQL course track, where you can learn about other advanced SQL topics like window functions, GROUP BY extensions, and recursive queries. Our platform offers many ways to practice these topics.

Have fun!