28th Aug 2020 6 minutes read

How CTEs Work

Table of Contents

CTE: The Basics
A Simple CTE Example
CTEs vs. Subqueries
Recursive CTEs
CTEs Are Useful

The SQL language offers a feature named Common Table Expressions, or CTEs. Also known as WITH clauses, CTEs are a fairly new addition to SQL. They help you break longer queries into smaller chunks, making your queries much easier to understand. Read the article to find out how to use CTEs and how they differ from traditional subqueries.

Common Table Expressions (CTEs), which are essentially named subqueries, were first mentioned in the SQL standard between 1999 and 2000. However, it was some time before major database vendors implemented CTEs in their engines: PostgreSQL included them in one of their 2009 versions, while MySQL first published a compliant release in 2018. That means CTEs can be a fairly new tool even for experienced database users.

If this is the first time you are coming across Common Table Expressions, you may want to read this introductory article: What is a CTE?

In this article, we’ll explain the theory behind CTEs. We’ll then show you a simple practical example. Finally, we’ll discuss the pros and cons of CTEs and compare them to traditional subqueries. Let’s get started.

CTE: The Basics

As we stated above, the CTE or Common Table Expression is a fairly new feature in the SQL language. It is a temporary set of rows that you define yourself and then reuse in the same query. CTEs are like subqueries and are sometimes even called “named subqueries.” You may also come across them under the name “WITH queries.”

One thing to keep in mind is that CTEs are always placed at the beginning of an SQL query. The basic syntax for a CTE looks like this:

WITH cte_name AS (cte_definition)
SELECT …
FROM cte_name;

A CTE is introduced at the beginning of a query with a WITH keyword. It needs to be given a name (cte_name) that we refer to later on in the query. Then, we define the CTE body within the parentheses.

There is an alternative syntax where we explicitly define the columns of the CTE:

WITH cte_name (cte_column1, cte_column2...) AS
  (cte_definition)
SELECT …
FROM cte_name;

The column definition is not mandatory, so why would we want to do it? First of all, it increases the readability of our instruction because we clearly state what the output of the CTE will be. Second, simple columns inside CTEs may not require aliases, but aggregates and other function results do. If your CTE contains columns such as AVG(amount) or MIN(value), they must be given names so they can be referred to outside the CTE. One way is to use the keyword AS inside the SELECT clause of the CTE. Another way is to provide the list of columns. Either method is fine, but you’ll likely see an error if you don’t use either of them.

A Simple CTE Example

Enough theory! Let’s see CTEs in action. Suppose we have the following simple table named employee:

employee

id	full_name	department	manager_id	experience_years	salary
1	Alan Poe	IT	34	3	40,000
2	Kate Rosemary	Accounting	45	9	65,000
...

We will create a CTE named experienced_accountant that will include only the employees from the “Accounting” department who have more than 3 years of experience. Then, we’ll use that CTE to calculate the average salary for an experienced accountant by manager. Take a look:

WITH experienced_accountant AS (
  SELECT id, full_name, manager_id, salary
  FROM employee
  WHERE department = “Accounting” AND experience_years > 3
)
SELECT manager_id, AVG(salary) AS avg_salary_exp_accountant
FROM experienced_accountant
GROUP BY manager_id;

We define our CTE at the beginning of the query. The database engine runs the CTE in the background and stores its result in the memory. This way, the CTE acts as an additional table we can call in the FROM clause. As a result, we’ll see the ID of each manager alongside the average salary of experienced accountants they manage. Indeed, when we run the query, we’ll see something along the lines of:

Result

manager_id	avg_salary_exp_accountant
34	67,830
147	81,220
258	80,000
...	...

Our CTE worked like a charm. The entire query is also nicely organized; the logic used to find experienced accountants is clearly separated from the main query.

CTEs vs. Subqueries

As we mentioned, CTEs are similar to subqueries. Both are “queries within queries” so why would you prefer CTEs over subqueries?

Generally speaking, CTEs are much more readable than subqueries. First, they are named, so you can clearly indicate what the CTE contains by its name. In the example above, our CTE was named experienced_accountant. We can figure out what it contains without even looking at its definition.

Second, since CTEs always appear at the beginning of a query, it also adds an extra layer of query organization. Especially with very long queries, using CTEs at the beginning makes the main part of the query more readable, compared to using subqueries scattered around the entire instruction.

If you’d like to read more about CTEs and the readability of your queries, take a look at this article.

One thing to keep in mind is the execution speed. You may think that CTEs are faster than subqueries, but they are not. The result of a CTE is not materialized (stored) anywhere. This means that if you refer to the same CTE for the second time in a query, it will be executed again. The execution speed will be the same for a CTE and a subquery in this case.

The advantage of a CTE is that you need to provide its definition only once. A subquery would have to be redefined in its entirety if you want to refer to it again elsewhere in your query.

Recursive CTEs

Some databases also allow the concept of recursive CTEs. They let us tackle complex computational problems that are impossible to solve with simple SQL instructions.

For instance, recursive queries can help find the shortest path between two places when multiple routes are available. We can also use them to present hierarchical structures, such as the hierarchy of employees in a big company.

This is an advanced topic outside the scope of this article, but if you’d like to find out more, take a look at our article available here. Note that not all databases support recursive CTEs, so first make sure to check if yours is compliant.

We also have a separate interactive course dedicated to recursive queries if you’re interested. Take a look at our Recursive Queries course page to find out the details. The course is also explained in detail in this blog article.

CTEs Are Useful

As we have seen, Common Table Expressions are similar to subqueries, but CTEs are much more readable and generally more advisable over subqueries. If you’d like to learn and practice more in detail, take a look at LearnSQL.com, where we offer an interactive course dedicated to CTEs.

Tags:

CTE: The Basics

A Simple CTE Example

CTEs vs. Subqueries

Recursive CTEs

CTEs Are Useful

You may also like