Back to articles list Articles Cookbook
Updated: 23rd Oct 2024 12 minutes read

How to Write a Recursive CTE in SQL Server

This guide will help you understand how to write and use recursive CTEs in SQL Server. Recursive CTEs are powerful tools for handling hierarchical data, and we'll break down their syntax and applications step by step.

SQL Server offers a lot of powerful tools for working with data, including Common Table Expressions (CTEs). A CTE is a temporary named result set that you can reference within a SELECT, INSERT, UPDATE, or DELETE statement, often referred to as a CTE query. CTEs can break down long queries into smaller, more manageable pieces of logic and make them more readable.

SQL Server offers recursive Common Table Expressions. A recursive Common Table Expression (CTE) in SQL Server allows you to perform recursive queries on hierarchical or graph-based data structures, such as organizational charts, family trees, transportation networks, etc. Recursive queries are used to loop through relationships between the data elements.

In this article, we'll start with standard CTE syntax and then explore how to write recursive CTEs in SQL Server.

A great way to further your understanding of recursive CTEs is to take our course on Recursive Queries in MS SQL Server. It will teach you how to master the most challenging type of query and effectively organize long SQL queries.

Basics of CTEs in SQL Server

First, let's review the basics of CTEs. A CTE begins with the WITH clause, followed by the name of the CTE and the SELECT statement that defines it. The syntax of a CTE looks like this:

WITH cte_name AS (
   SELECT 
     column1,
     column2,
     …
   FROM table
   … 
) 
SELECT … 
FROM cte_name, …;

Let’s start at the top. The first thing we are doing is giving our CTE a name, cte_name.  This CTE selects some columns (column1, column2, ...) from table. The rest of the query (the outer query) can refer to the CTE and its columns as if the CTE were a regular table.

You can think of a CTE as a temporary view that you can reference in the outer query, as we’re doing in the above example.

CTEs eliminate the need for temporary tables and nested subqueries, streamlining the query definition process. Unlike derived tables, which can be cumbersome and less readable, CTEs provide a clearer and more manageable way to structure complex queries.

Let’s say you wanted to write a query that returns the purchases managed by the five employees with the highest salary in the Sales department. The tables we will be using in this example are called employees (includes the columns employee_id, employee_name, and salary) and purchase (contains the columns id, date, customer_id, and total_amount).

WITH top_5_sales_employees AS (
   SELECT TOP 5
     employee_id,
     employee_name 
   FROM employees
   WHERE department_id = ‘Sales’ 
   ORDER BY salary DESC
) 
SELECT 
  p.id AS purchase_id,
  p.date,
  p.total_amount,
  e.id AS employee_id
  e.employee_name 
FROM top_5_sales_employees e
JOIN purchase p
ON p.employee_id = e.id;

The CTE starts with the WITH keyword, followed by the name top_5_sales_employees. In the parentheses, we select the columns we want to include in the CTE and specify the conditions for the query in the WHERE clause. Finally, we use the SELECT statement to select appropriate columns from the CTE as if it were a regular table.

To read more about what CTEs are and how you write them, check out our article What Is a CTE in SQL Server? If you’re looking for more CTE examples, you’ll find them in this article.

How to Use Recursive CTEs in SQL Server

So, what is a recursive CTE in SQL Server? A recursive CTE is a type of CTE that references itself in the SELECT statement, creating a loop. Recursive CTEs are used to traverse hierarchical data structures, such as organizational charts or networks.

Let's say we have a table called employees with columns for the employee's name, department, and manager. This is shown in the table below, where each record includes an employee and who they report to within the organization.

idnamedepartmentmanager_idmanager_name
124John DoeIT135Jane Miller
135Jane MillerHR146Sarah Smith
146Sarah SmithCEONULLNULL

At a glance, it’s pretty easy to see who reports to who and what this organizational hierarchy looks like. However, if we had hundreds of employees, it would be much more difficult to make sense of the data.

We can use a recursive CTE to generate a hierarchical tree of the company's employees. To do this, we would run this query:

WITH employee_manager_cte AS (
  SELECT 
    id, 
    name,
    department,
    manager_id,
    manager_name,
    1 AS level 
  FROM employees 
  WHERE manager_id IS NULL 
  UNION ALL 
  SELECT 
    e.id, 
    e.name,
    e.department,
    e.manager_id,
    e.manager_name,
    level + 1 
  FROM employees e 
  INNER JOIN employee_manager_cte r 
	ON e.manager_id = r.id 
) 
SELECT * 
FROM employee_manager_cte;

Let's break down this query step by step.

First, we define the recursive CTE with the name employee_manager_cte. We select the columns we want to include in the query: id, namedepartment, manager_id, manager_name,  and level. The level column is used to track the depth of the tree. We’ll start with the level 1; as we progress through the loop, this number will increase.

recursive CTE in SQL Server

This section before UNION ALL is referred to as the anchor member. In the anchor member, we start our loop. In our example, we select all the employees whose manager is NULL. In our organization chart, this will be the employees at the very top. In this case, there is only one employee at this level: Sarah Smith, the CEO.

The piece after the UNION ALL is called the recursive member. In the recursive member, we add new rows to the rows that have already been computed. In our example, we join the employees table with the employee_manager_cte CTE on the manager_id column. This creates a loop that traverses the tree from the top down. We add 1 to the level column to track the depth of each node.

Finally, we select all the columns from the employee_manager_cte CTE.

When you execute this query, SQL Server first processes the anchor member, which selects Sarah Smith as the root of the tree. It then processes the recursive member, which joins Sarah Smith with her direct report (Jane Miller). It then joins Jane Miller with her direct report (John Doe), and John Doe with his direct report (none). Since there are no more rows to add to the result set, SQL Server stops processing the CTE and returns the final result.

Here's what the result set looks like:

idnamedepartmentmanager_idmanagerlevel
146Sarah SmithCEONULLNULL1
135Jane MillerHR146Sarah Smith2
124John DoeIT135Jane Miller3

Another great explanation article you can check out is What Is a Recursive CTE in SQL? There you will find more practical examples of recursive CTEs in SQL.

Tips for Writing Recursive CTE in SQL Server

1. Start with the anchor member

The anchor member is the starting point of the recursive CTE. It's the part of the query that defines the base case, or the first set of rows that will be returned. In our organization chart example, this is the highest level of management. Make sure the anchor member returns all the columns you need in the final result set.

2. Ensure the number of columns match

The recursive member and anchor members must have the same number of columns and the same data types corresponding columns: In a recursive CTE, the recursive member references the CTE itself, so it's important to ensure that the recursive member and anchor member have the same number of columns and the same data types in corresponding columns.

3. Always UNION ALL

In SQL Server you may only use UNION ALL to combine an anchor member and a recursive member: When combining the anchor member and recursive member, you must use UNION ALL, not UNION. UNION ALL preserves all rows, including duplicates, whereas UNION removes duplicates. UNION is not allowed in recursive queries in SQL Server.

4. Beware Infinite Loops!

Make sure to write a termination check in your query. A termination check is a condition that stops the recursive CTE from looping indefinitely. Without a termination check, the query will by default run a maximum of 100 recursions and then error out.

The termination check is typically included in the WHERE clause of the recursive member, and it specifies when the recursion should stop.

By following these tips, you can write efficient and effective recursive CTEs that help you solve complex hierarchical problems in SQL Server. Be sure to check out this article on CTE best practices for more details.

Recursive Common Table Expressions in SQL Server – More Examples

Example 1: Recursive CTE for a Transportation Network

Another example of when you might use a recursive CTE is in showing a transportation network. Suppose we have a table called routes that stores information about transportation routes between cities. It includes the source city, the destination city, and the distance between the cities. We want to write a query that returns all the cities that are reachable from a given starting city, along with the total distance to each city.

Our data lives in a table called routes:

source_citydestination_citydistance
New YorkBoston215
New YorkPhiladelphia95
PhiladelphiaWashington140
BostonChicago985
WashingtonAtlanta640
AtlantaMiami660

Here is the query:

WITH recursive_cte AS (
  SELECT 
    source_city, 
    destination_city, 
    distance,
    source_city AS visited_cities 
   FROM routes 
   WHERE source_city = ‘New York’
   
   UNION ALL 
   
   SELECT 
     r.source_city, 
     r.destination_city, 
     r.distance + rc.distance,
     rc.visited_cities + ‘,’ + r.destination_city
   FROM routes r 
   INNER JOIN recursive_cte rc 
	ON r.source_city = rc.destination_city 
   WHERE rc.distance < 2000
AND CHARINDEX(',' + r.destination_city + ',', ',' +                                 rc.visited_cities + ',') = 0
) 
SELECT 
  destination_city, 
  distance
FROM recursive_cte

This recursive CTE starts with the anchor member, which selects all the routes that start at New York. In the recursive member, we join the routes table with the recursive_cte CTE on the source_city column to find all the cities reachable from New York.

We have added a new column called visited_cities that stores the list of visited cities as a comma-separated string. We initialize this column in the anchor member of the CTE by setting it to the source city. In the recursive member, we concatenate the current city to the list of visited cities and check if the destination city has already been visited using the CHARINDEX function. If the destination city has not been visited, we add it to the list of visited cities and continue the recursion

We keep adding rows to the result set until there are no more cities to add or until we hit our termination check. The result set shows all the cities that are reachable from New York and the total distance to each city.

destination_citydistance
Boston215
Philadelphia95
Chicago1200
Washington235
Atlanta875
Miami1535

Example 2: Using Recursive CTE for Task Dependencies in a Project

Another scenario in which we could use a recursive CTE would be to understand task dependencies for a project.

Suppose we have a table called tasks that stores information about tasks in a project, including the task_id, the task_name, the ID of the task it depends on (depends_on_task_id), and the time_required to complete the task. We want to write a query that calculates the total time required to complete a given task, including all its dependent tasks.

The data is shown below:

task_idtask_namedepends_on_task_idtime_required
1DesignNULL5
2Development110
3Testing25
4Documentation13
5Deployment32
6MarketingNULL7

Let’s write a query that pulls the total time required for the development task and its dependencies.

WITH recursive_cte AS (
  SELECT 
    task_id, 
    task_name, 
    depends_on_task_id, 
    time_required
  FROM tasks 
  WHERE task_id = 2
  
  UNION ALL 

  SELECT 
    t.task_id, 
    t.task_name, 
    t.depends_on_task_id, 
    t.time_required + rc.time_required
  FROM tasks t 
  INNER JOIN recursive_cte rc 
	ON t.depends_on_task_id = rc.task_id
WHERE rc.total_time < 20) 
SELECT 
  task_name, 
  time_required as total_time
FROM recursive_cte;

This recursive CTE starts with the anchor member, which selects the row in the tasks table with task_id = 2 (Development). In the recursive member, we join the tasks table with the recursive_cte CTE on the depends_on_task_id and task_id columns to find all the tasks that depend on Development. We calculate the total time required for each task by adding the time required for the current task to the total time required for its previous tasks.

Finally, we query the results by task_name and the time_required for each task. Here are the results:

Task_nameTotal_time
Development10
Testing15
Deployment17

The task_name column shows the name of each task, and the total_time column shows the total time required to complete the task and all its previous tasks. For example, the first row shows that the total time required to complete Development and all its dependencies is 15, which is the sum of the time required for Development (10) and Testing (5).

The other rows show the same concept, with the task_name and total_time columns indicating the name of each task and its total time requirement.

Learn More About Recursive CTEs in SQL Server

Recursive CTEs in SQL Server can be useful in a variety of scenarios such as:

  • Employee hierarchies: A recursive CTE can be used to traverse a tree of employees and their managers to calculate the total compensation of all employees in a given branch.
  • Product categories: A recursive CTE can be used to traverse a tree of product categories to find all the products in a given category and its subcategories.
  • Social networks: A recursive CTE can be used to traverse a graph of social network connections to find all the friends of a given user and their connections.
  • Task dependencies: A recursive CTE can be used to traverse a tree of task dependencies to calculate the time required to complete a project.

In general, any scenario where data is organized hierarchically or in a graph structure can potentially benefit from using a recursive CTE. Follow best practices like starting with the anchor member, ensuring both members have the same number of columns and data types, and adding a termination check. This will help you write efficient queries to handle hierarchical data and solve complex problems.

Take your skills to the next level with our Recursive Queries in MS SQL Server course. This interactive course offers a comprehensive learning experience, covering everything from the basics to advanced recursive CTE techniques. You'll work through 112 interactive, hands-on exercises designed to help you understand and apply recursive queries effectively.

Recursive Queries in MS SQL Server

This awesome SQL Server course also comes with a free trial, so you can explore the content risk-free. By the end, you'll be ready to handle complex data challenges with confidence. Don't wait—sign up for the free trial today and see how mastering recursive CTEs can transform your SQL skills!