18th Aug 2021 10 minutes read 5 Practical SQL CTE Examples Kateryna Koidan sql learn sql CTE Table of Contents Common Table Expressions in SQL SQL CTE Examples Example 1 Example 2 Example 3 Example 4 Example 5 Let’s Practice SQL CTEs! Common table expressions (CTEs) were introduced into SQL to improve the readability and the structure of SQL queries, especially those requiring multiple steps to get the necessary output. In this article, we will go through several examples to show how SQL CTEs can help you with complex calculations and hierarchical data structures. Common Table Expressions in SQL Common table expressions (CTEs), also called WITH clauses, allow creating named subqueries that are further referenced in the main query. CTEs were introduced in SQL to improve the readability and the structure of an SQL statement. The basic CTE syntax is as follows: WITH subquery_name AS (SELECT … subquery ...) SELECT … main query ... We start with the WITH keyword followed by the name we assign to the CTE (subquery). Then, we put the AS keyword and include the subquery in parentheses. After the CTE is defined, we move on to the main query, where we can reference this CTE by its name. If you are new to CTEs, you may need to check out this article that explains in more detail how CTEs work. It is possible to have multiple CTEs in one query, reference one CTE within another (i.e., nested CTEs), or even reference a CTE within itself (recursive CTEs). This gives us a whole bunch of tools and opportunities. SQL CTE Examples To show how CTEs can assist you with various analytical tasks, I’ll go through five practical examples. We’ll start with the table orders, with some basic information like the order date, the customer ID, the store name, the ID of the employee who registered the order, and the total amount of the order. orders iddatecustomer_idstoreemployee_idamount 1012021-07-01234East11198.00 1022021-07-01675West13799.00 1032021-07-01456West14698.00 1042021-07-01980Center1599.00 1052021-07-02594Center161045.45 1062021-07-02435East11599.00 1072021-07-02246West14678.89 1082021-07-03256East12458.80 1092021-07-03785East1299.00 1102021-07-03443Center16325.50 Now, let’s write a couple of SQL queries! You may also practice SQL CTEs in this interactive Recursive Queries course that covers all kinds of CTEs. Example 1 In our first example, we want to compare the total amount of each order with the average order amount at the corresponding store. We can start by calculating the average order amount for each store using a CTE and adding this column to the output of the main query: WITH avg_per_store AS (SELECT store, AVG(amount) AS average_order FROM orders GROUP BY store) SELECT o.id, o.store, o.amount, avg.average_order AS avg_for_store FROM orders o JOIN avg_per_store avg ON o.store = avg.store; As you see, our query begins with a CTE called avg_per_store. Using this CTE, we create a table that lists all stores and the average order amount by store. Then, in the main query, we select to display the order ID, the store name, the order amount from the original orders table, and the average order amount for each store (avg_for_store) from the CTE defined earlier. Here’s the output: idstoreamountavg_for_store 101East198.00338.70 102West799.00725.30 103West698.00725.30 104Center99.00489.98 105Center1045.45489.98 106East599.00338.70 107West678.89725.30 108East458.80338.70 109East99.00338.70 110Center325.50489.98 With this table, we can see how each order compares to the average order amount at the corresponding store. Now, let’s move on to a more complex example. Example 2 Here, we’ll compare different stores. Specifically, we want to see how the average order amount for each store compares to the minimum and the maximum of the average order amount among all stores. As in our first example, we’ll start by calculating the average order amount for each store using a CTE. Then, we’ll define two more CTEs: To calculate the minimum of the average order amount among all stores. To calculate the maximum of the average order amount among all stores. Note that these two CTEs will use the result of the first CTE. Finally, in the main query, we’ll join all three CTEs to get the information we need: WITH avg_per_store AS ( SELECT store, AVG(amount) AS average_order FROM orders GROUP BY store), min_order_store AS ( SELECT MIN (average_order) AS min_avg_order_store FROM avg_per_store), max_order_store AS ( SELECT MAX (average_order) AS max_avg_order_store FROM avg_per_store) SELECT avg.store, avg.average_order, min.min_avg_order_store, max.max_avg_order_store FROM avg_per_store avg CROSS JOIN min_order_store min CROSS JOIN max_order_store max; As you see, even with multiple nested CTEs, the SQL query remains clean and easy to follow. If you were to use subqueries, you would need to nest one subquery within the other two and repeat it several times within the same query. Here, with CTEs, we simply define all three CTEs at the beginning then reference them when needed. Here’s the output of this query: storeaverage_ordermin_avg_order_storemax_avg_order_store Center489.98338.70725.30 East338.70338.70725.30 West725.30338.70725.30 You can easily see how each store compares with others in terms of the average order amount. Of course, when you have only three stores, we could just compare them without adding the min_avg_order_store and max_avg_order_store columns. However, when you need to analyze the performance of many stores by different metrics, this approach might be very helpful. Read this guide to learn the SQL CTE best practices. Example 3 In our next example, we’ll continue with comparing the performance of our stores but with a few different metrics. Let’s say our company considers orders below $200 to be small and orders equal or above $200 to be big. Now, we want to calculate how many big orders and small orders each store had. To address this task using WITH clauses, we need two common table expressions: To get the number of big orders for each store. To get the number of small orders for each store. Some stores may not have any big orders or any small orders, leading to NULL values. We need to make sure we don’t lose any stores during JOINs. For this reason, I prefer to have yet another CTE that simply outputs a list of all stores. Then, in the main query, we’ll join this CTE with the two CTEs containing the metrics on big and small orders: WITH stores AS (SELECT store FROM orders GROUP BY store), big AS (SELECT store, COUNT(*) AS big_orders FROM orders WHERE amount >= 200.00 GROUP BY store), small AS (SELECT store, COUNT(*) AS small_orders FROM orders WHERE amount < 200.00 GROUP BY store) SELECT s.store, b.big_orders, sm.small_orders FROM stores s FULL JOIN big b ON s.store = b.store FULL JOIN small sm ON s.store = sm.store; So, in this query, we: Define the CTE stores to get a full list of stores. Define the CTE big to calculate, for each store, the number of orders with the total amount equal to or above $200. Define the CTE small to calculate, for each store, the number of orders below $200. Join all three CTEs. Here’s the output: storebig_orderssmall_orders Center21 East22 West3NULL We can now see the West store performs really well; all of its orders are above $200. The Center store is also good, with two orders above $200 and one order below $200. Only half of the orders at the East store are big, with two orders above $200 and two orders below $200. Example 4 For the next two examples, we’ll use the table below with some basic information about the employees of our company. Specifically, we have the employee ID, the first name, the last name, the ID of the employee’s superior, the department, and the last bonus amount. employees idfirst_namelast_namesuperior_iddepartmentbonus 1JohnDaviesNULLCEO2545.00 2MarkTaylor1Finance1100.00 3KateWilson1Operations900.00 4OliviaWatson3Operations450.00 5JamesAddington1Sales1900.00 6RachaelWhite1Marketing1250.00 7SaraClinton6Marketing1000.00 11JohnSmith5Sales800.00 12NoahJones11Sales500.00 13StevenBrown5Sales900.00 14LiamWilliams13Sales700.00 15PaulLee5Sales500.00 16PatrickEvans15Sales500.00 Now, let’s calculate the average bonus by department, then count how many employees had bonuses above their respective department average and how many had below. Common table expressions can be very handy with such complex calculations. We’ll have three CTEs in this SQL query: To calculate the average bonus amount for each department. To calculate, by department, the number of employees whose bonuses were above their respective department average. To calculate, by department, the number of employees whose bonuses were below their respective department average. In the main query, we’ll join all three CTEs. WITH avg_bonus_department AS (SELECT department, AVG(bonus) AS average_bonus FROM employees GROUP BY department), above_average AS (SELECT e.department, count(*) AS employees_above_average FROM employees e JOIN avg_bonus_department avg ON e.department = avg.department WHERE bonus > average_bonus GROUP BY e.department), below_average AS (SELECT e.department, count(*) AS employees_below_average FROM employees e JOIN avg_bonus_department avg ON e.department = avg.department WHERE bonus < average_bonus GROUP BY e.department) SELECT avg.department, avg.average_bonus, aa.employees_above_average, ba.employees_below_average FROM avg_bonus_department avg LEFT JOIN above_average aa ON avg.department = aa.department LEFT JOIN below_average ba ON avg.department = ba.department; Here’s the result of the query: departmentaverage_bonusemployees_above_averageemployees_below_average CEO2545.00NULLNULL Marketing1125.0011 Finance1100.00NULLNULL Operations675.0011 Sales828.5725 Since there is only one person in Finance, the average bonus for the department is exactly equal to the bonus of this person. As a result, we have nobody in the Finance department whose bonus was either above or below average (reflected as NULL values in the result). The same applies to the CEO. For the Sales department, we can see that the average bonus was $828.57, and only two out of seven people had bonuses above the department average. We will leave you to interpret the results for the Marketing and Operations departments in the same way, and we will move on to an even more complex example with a recursive query. Example 5 Common table expressions can reference themselves, making them a perfect tool for analyzing hierarchical structures. Let’s see with an example. Using the information from the employees table and the orders table, we can draw the following organizational structure of our company. The store personnel are considered part of the Sales team. In addition, in the orders table, we can see which employees have orders in which stores, so we can derive the store to which each salesperson belongs. Now, let’s say we need to find out the level of each employee in the organizational structure (i.e., level 1 is the CEO, level 2 is for his direct reports, etc.). We can add a column that shows this with a recursive query: WITH RECURSIVE levels AS ( SELECT id, first_name, last_name, superior_id, 1 AS level FROM employees WHERE superior_id IS NULL UNION ALL SELECT employees.id, employees.first_name, employees.last_name, employees.superior_id, levels.level + 1 FROM employees, levels WHERE employees.superior_id = levels.id ) SELECT * FROM levels; As you see, the CTE levels in this query references itself. It starts with selecting the record corresponding to the big boss, the one who doesn’t have a superior (i.e., superior_id IS NULL). We assign 1 to the level of this person, then use UNION ALL to add other records, adding one to it for each level of management in the organizational structure. Here’s the output: idfirst_namelast_namesuperior_idlevel 1JohnDaviesNULL1 2MarkTaylor12 3KateWilson12 5JamesAddington12 6RachaelWhite12 4OliviaWatson33 7SaraClinton63 11JohnSmith53 13StevenBrown53 15PaulLee53 12NoahJones114 14LiamWilliams134 16PatrickEvans154 The topic of recursive queries is quite challenging, so I will not go into more details here. But please make sure to check out this article explaining recursive CTEs with examples, especially if you work with hierarchical data. And see this article for more SQL CTE examples. Let’s Practice SQL CTEs! I hope these examples have shown you how handy CTEs can be for different analytical assignments. They help to improve the readability and the structure of your SQL queries, help with nested and complex computations, and are useful for the efficient processing of hierarchical data. Learn more about when to use CTEs in this article. If you want to master common table expressions, I recommend starting with this unique Recursive Queries course. It includes 114 interactive exercises covering all kinds of CTEs, including simple CTEs, nested CTEs, and recursive CTEs. By the end of the course, you’ll know how to manage SQL queries with CTEs, how and when to nest CTEs, and how to use recursive CTEs to move through hierarchical data models. If you want to master other advanced tools for data analysis with SQL, you should consider taking the Advanced SQL track! It goes beyond CTEs and also covers window functions and GROUP BY extensions in SQL. And LearnSQL.com offers many different ways to practice these advanced SQL concepts online. Thanks for reading, and happy learning! Tags: sql learn sql CTE