13th Jun 2024 14 minutes read SQL CTEs: A Complete Overview of Common Table Expressions Gustavo du Mortier recursive queries Common Table Expressions Table of Contents CTE Syntax A Basic CTE Example Multiple CTEs in a Single Query Recursive Queries The Benefits of Common Table Expressions Facilitating Query Organization and Clarity Multi-Level Aggregations Querying Hierarchical Data Have Fun Drawing Pictures with CTEs Alternatives to SQL CTEs CTEs vs. Subqueries CTEs vs. Views CTEs vs. Temporary Tables CTEs in Different DBMSs Putting Your Knowledge of SQL CTEs Into Practice The Importance of Knowing About CTEs in SQL This article provides a detailed guide to everything you need to know about common table expressions (CTEs), from the basic definition to the most complex recursive queries. If the database you use supports CTEs, here’s everything you need to get the most out of it. Common table expressions – also called CTEs, WITH clauses, or recursive queries (although this last name is actually a specific application) – are a relatively new feature of SQL. They aim to simplify complex queries, making them easier to read, understand, and maintain. You may want to start by reading about what a common table expression is to get the full picture. Then, buckle up and get ready, because this article will drive you through all the different CTE resources (courses, exercises, examples, and explanations) available on our LearnSQL.com website. CTEs are useful for breaking down large, complex queries into smaller, more understandable parts – much like views, subqueries, and temporary tables. And – also views, subqueries, and temporary tables – you can combine CTEs to arrive at a final result. However the advantage of a CTE is that they do not generate objects that persist or occupy space in the database (as do views and temporary tables). And they are easier to read and interpret than subqueries. If you take our Recursive Queries course, you will have a complete overview of SQL CTEs. In it, you’ll learn CTE syntax and the different ways to use it. You’ll get 100+ interactive exercises that you can run freely on our online database. All you need is to have a basic knowledge of SQL and a browser with an Internet connection. CTE Syntax The general form of an SQL CTE begins with a WITH clause followed by a query definition (just a normal SELECT statement) that is assigned a name. Following that definition is a SELECT statement that refers to the CTE query by the name assigned to it, just as if it were a table or a view. For example: WITH cte_name AS (cte_query_definition) SELECT * FROM cte_name; The WITH clause heads the query and contains a subquery to which a name is assigned. The main query (located after the WITH clause) is a normal SELECT statement that could use the named subquery as many times as needed. When you execute a query that contains CTEs, the database engine first executes the CTE query(s), saving the CTE query results until it finishes executing the entire query; the CTE results could be considered intermediate results. It uses those intermediate results like tables to compose the final result. It then returns the final result and discards the intermediate results that it generated previously. A Basic CTE Example Let’s see a simple example. You have a schema with three tables: employee, division, and payment. You need to get a list of the employees with their maximum payment and the division to which each employee belongs. You can create a CTE that first resolves the maximum pay for each employee in a subquery called max_paid. Then, in the main SELECT, you join max_paid with employee and division to get the final result: WITH max_paid (employee_id, max_payment) AS ( SELECT emp.employee_id, MAX(pay.payment) AS max_payment FROM employee AS emp INNER JOIN payment AS pay ON pay.employee_id = emp.employee_id GROUP BY emp.employee_id ) SELECT emp.employee_id, emp.name AS employee_name, div.name AS division_name, mp.max_payment FROM max_paid mp INNER JOIN employee AS emp ON emp.employee_id = mp.employee_id INNER JOIN division AS div ON div.division_id = emp.division_id; When the database runs this query, it first builds a temporary dataset with the results of the CTE query and names it max_paid. Then, in the outer SELECT, it uses max_paid just as if it were a table. It joins max_paid together with the real tables in the schema (employee and division) to build the final result. For a deeper understanding of SQL CTE syntax, I recommend these 5 practical examples of WITH clauses. Multiple CTEs in a Single Query More than one CTE can be used in the same query. The way to do this is simply to separate each CTE definition with a comma: WITH cte1 AS ( SELECT ... FROM ... ), cte2 AS ( SELECT ... FROM ... ) SELECT ... FROM cte1 JOIN cte2, ... Each CTE can make use of other previously-defined CTEs in the same query; this is called nesting. This allows CTEs to decompose large and complex queries into smaller (and more manageable) subqueries. You can gradually solve each part of a problem until it is simplified into one final SELECT. The following example uses two nested CTEs that get information from a table called sales. The first CTE, SalesByDayRegion, obtains sales totals by day and by region. The second, SalesByDay, builds on SalesByDayRegion to get sales totals for each day. The final SELECT joins both CTEs to calculate the percentage of sales for each region in relation to the totals per day. Here’s the query: WITH SalesByDayRegion AS ( SELECT day, region, SUM(amount) AS AmountByDayRegion FROM Sales GROUP BY day, region ), SalesByDay AS ( SELECT day SUM(GroupedAmount1) AS AmountByDay FROM Sales GROUP BY day ) SELECT sdr.day, sdr.region, AmountByDayRegion AmountByDayRegion / AmountByDay AS Percentage FROM SalesByDayRegion sdr INNER JOIN SalesByDay sd ON sdr.day = sd.day; Recursive Queries WITH clauses – that is, CTEs – give SQL the ability to implement recursion. This eliminates the need to do so through other programming languages, thus achieving greater efficiency in query execution. In any programming language, recursion is implemented by having a function or procedure call itself repeatedly until a termination condition is met. Each recursive iteration adds to or modifies the data resulting from the previous iteration and provides its results to the next iteration. When the termination condition is met, it returns the final result. In SQL, recursion is possible because CTEs can reference themselves. In most SQL dialects (except Transact SQL), the word RECURSIVE is used after WITH to indicate recursion. You can read more about what a recursive CTE is here. Recursive CTEs are composed of the following elements: an anchor member, a recursive member, a termination check, and an invocation. The anchor member establishes the starting point for recursion. This part of the CTE must be solvable without invoking itself. The recursive member uses the results of a previous iteration to process or add more data to the results, obtaining a new result set for the next iteration – or, when the termination condition is met, a result set for the invocation member. The anchor member and the recursive member are combined using a UNION operator. This means the result sets of both members must have the same column structure: their columns must be in the same order and must have the same data types. In the following example, we use a recursive CTE to calculate the factorial of the numbers 1 to 5: WITH RECURSIVE factorial(n, factorial) AS ( SELECT 1, 1 UNION ALL SELECT n + 1, (n +1) * factorial FROM factorial WHERE n < 5 ) SELECT * FROM factorial; In this example, the anchor member of the recursive CTE is: SELECT 1, 1 And the recursive member, along with the termination condition, is the following SELECT: SELECT n + 1, (n +1) * factorial FROM factorial WHERE n < 5 The invocation is simply a SELECT * of the CTE factorial. Note that this SELECT treats factorial as if it were a table. If we are only interested in seeing the first three rows of the result of the factorial calculation, we could add this to the invocation: SELECT * FROM factorial WHERE n <= 3; This example is extremely simple. Recursive CTEs can be used to solve more complex needs, such as traversing a data set that represents a tree structure. You might use this to explore a company’s organizational chart – as we will see in an example below. If you want to dig deeper into recursivity in SQL, check out our Recursive Queries online course. The Benefits of Common Table Expressions Common table expressions have many benefits, from improving queries’ understandability to traversing hierarchical data structures. You can even draw pictures with SQL. As a starting point, I suggest you read the articles already mentioned, CTEs explained with examples and 5 practical examples of CTEs in SQL. Then, read along to see when it is useful and effective to use CTEs. Facilitating Query Organization and Clarity CTEs allow you to solve complex queries by breaking them down into several smaller, more manageable and easier to read queries. In turn, the ability to use nested CTEs allows you to move gradually toward a solution. A CTE is equivalent to a subquery, but with a very important advantage: you can give a CTE a name and use it repeatedly in different parts of your query. It is as if you were creating a view – with the difference that the view is a perennial database object. The CTE exists only as long as you execute the query; then it disappears without a trace. The following example shows how to create a CTE to get summary data from a sales table and then use that CTE to make different queries: WITH TotalSalesByCategory AS ( SELECT customer_category, SUM(amount) AS total_sales FROM sales GROUP BY customer_category ) SELECT 'big customers' AS grouping, SUM(total_sales) AS grand_total FROM TotalSalesByCategory WHERE customer_category IN ('A', 'B', 'C') UNION SELECT 'medium customers' AS grouping, SUM(total_sales) AS grand_total FROM TotalSalesByCategory WHERE customer_category IN ('D', 'E', 'F') UNION SELECT 'small customers' AS grouping, SUM(total_sales) AS grand_total FROM TotalSalesByCategory WHERE customer_category IN ('G', 'H', 'I'); Learn how to declutter your queries in these articles: How to Organize SQL Queries with CTEs Improving Query Readability with Common Table Expressions Multi-Level Aggregations We’ve seen different options for including multiple CTEs in the same query. By doing so, you multiply the power of CTEs to reduce the complexity of a query and improve its readability. The options for using multiple CTEs in SQL boil down to: Multiple independent CTEs. Nested CTEs. Use a recursive CTE along with non-recursive ones. Learn how to take advantage of combined CTEs with the following articles: How to Write Multiple CTEs in SQL How to Use 2 CTEs in a Single SQL Query Can Multiple WITH Statements Be Used in SQL? Querying Hierarchical Data Hierarchical data is characterized by having an ordered relationship between its elements. In general, hierarchical data is organized in levels; you can find out which data is “above” or “below” other pieces of information. In relational databases, hierarchical relationships can be established within a table by making each row contain a field that allows it to be related to a higher-level row within the same table. In this way, it is possible to build hierarchical data structures with “parent” rows and “child” rows, which are usually represented visually as a tree-like structure. CTEs are also used to query hierarchical data. Find out how by reading more about querying hierarchical data in SQL. A typical example of a hierarchical data structure is the organizational chart of a company, where the rows of the table contain employee data and each employee refers to their boss. For more details, read how to find all the employees of each manager in SQL. Have Fun Drawing Pictures with CTEs Looking for a fun way of writing ingenious queries in SQL? You can’t make great works of art using SQL, but you can make schematic drawings using recursive CTEs. We mentioned earlier that you can use recursion to traverse hierarchical tree-like data structures, such as an organizational chart. You may be surprised to learn that you can draw as well as traverse hierarchical structures with SQL. To find out, read this article on using SQL to draw a rather unique Christmas tree. Alternatives to SQL CTEs Leaving aside recursion (which in SQL is a unique virtue of CTEs), the rest of the benefits provided by CTEs can be achieved with subqueries, views, and temporary tables. Let’s compare each of these to a CTE. CTEs vs. Subqueries Subqueries in SQL must be written in parentheses and included with the rest of the elements of the main query. This means they don’t give much clarity to the query. CTEs are written separately, at the beginning of the query and inside their own WITH clause. This makes reading the query much easier. If you are not interested in seeing the CTE’s inner workings, you can overlook the WITH clause and only read the main SELECT to understand the results. You can get more details about the differences between CTEs and subqueries by reading this article that thoroughly differences between CTEs and subqueries. Also, subqueries are not reusable. If you want to use the result of a subquery in different parts of a query, you will have to write it again every time. Meanwhile, CTEs can be reused as many times as desired in the main SELECT or in other CTEs of the same query. This improves ease of reading and query performance; each CTE is executed only once and its results are available throughout the query. Finally, it’s good to note that subqueries within the FROM clause carry an alias that is used in the rest of the query to access their results. But other subqueries that are included in the column list of a SELECT or WHERE clause do not carry aliases. CTEs always carry a name that is used to reference it throughout the rest of the query. CTEs vs. Views Views have much in common with CTEs, but there is a crucial difference between them. Views remain as objects in the database until someone removes them. CTEs exist only in the context of their query. Once the query finishes executing, the CTE disappears. Read this article on the differences between CTEs and views to learn more. CTEs vs. Temporary Tables Temporary tables also have many similarities to CTEs. The lifecycle of a temporary table ends with the database session in which the temp table was created, so you don’t have to worry about leaving unnecessary objects cluttering up the database schema. But there is a fundamental difference: the creation of the temporary table must be done in a separate command from the query that uses it. You cannot create a temporary table inside a SELECT query; if you did, it would be a subquery, not a temporary table. If you create a temporary table and then realize that you need to add a column to it or change anything else, you must delete it and recreate it. With a CTE, you only have to make the change in its definition and re-run the query. You can read in more detail about differences between CTEs and temporary tables in our article CTE vs. Temporary Table: What’s the Difference?. CTEs in Different DBMSs CTEs appeared in the 1999 SQL standard. Their implementation in each of the most popular database management systems (DBMSs) was as follows: PostgreSQL: version 8.4, July 2009 MS SQL Server: version 2012, May 2012 Oracle Database: version 9.2, June 2002 MySQL: version 8.0, April 2018 MariaDB: version 10.2.2, May 2018 Google BigQuery also offers support for CTEs. As it runs on the Cloud as a fully managed data warehouse, it is enough to know that it currently offers support for CTEs. The SQL CTE syntax is the same for all DBMSs. The only exception is in MS SQL Server (T-SQL), where the syntax for writing a recursive CTE does not require using the RECURSIVE keyword. You can learn more about CTEs in MS SQL Server by reading about recursion in SQL Server and viewing CTE examples in SQL Server. You can read in more detail about CTEs in MySQL and CTEs in PostgreSQL on our blog. Putting Your Knowledge of SQL CTEs Into Practice Throughout this article, you have learned all about common table expressions in SQL. Now you need to put that knowledge into practice. The best way to do this is to take one of our specific courses on CTEs and recursive queries, where you will find tons of interactive exercises. These courses are geared toward students, intermediate database analysts, and beginning data scientists. The only requirement is some mastery of basic SQL like SELECT statements, JOINs, GROUP BY, etc. You can opt for our standard Recursive Queries course, which is SQL dialect-agnostic, or choose a specific course for the DBMS of your choice: Recursive Queries in MS SQL Server Recursive Queries in MySQL 8 Recursive Queries in PostgreSQL None of these courses require you to use your own databases. Our platform provides you with a ready-to-use database, complete with schema and populated with information. These practice environments are safe; you don’t have to worry about making mistakes, as there is no chance of breaking anything. Check out these other valuable resources to strengthen your knowledge of WITH queries: Top 5 SQL CTE Interview Questions 11 SQL Common Table Expression Exercises The Importance of Knowing About CTEs in SQL Mastering SQL’s common table expressions (CTEs) and recursive queries begins with grasping their basics and then progressing to more complex applications. Our curated resources make learning CTEs both enjoyable and practical. To maintain your CTE skills, make it a habit to practice daily. When faced with complex queries, consider if using one or more CTEs could simplify the task by breaking it into smaller, more manageable parts. Soon, you might find yourself using CTEs routinely in your SQL tasks. Additionally, I recommend checking out our Recursive Queries course to further enhance your understanding of SQL CTEs. It will equip you with the tools to confidently tackle even the most intricate queries. Happy learning! Tags: recursive queries Common Table Expressions