Back to articles list Articles Cookbook
12 minutes read

Top 5 SQL CTE Interview Questions

Here are five questions (and solutions!) on CTEs you’ll (probably) be asked at an interview.

I’m not saying you will get these SQL CTE interview questions at every job interview. But when you do, they are probably along the lines of the five I’m about to show here.

Other than theoretical questions about CTEs, there are not that many variations in the CTE scenarios evaluated by interviewers. Go through these five examples, and you get a good foundation for acing your interview!

For this, you have to know what SQL common table expressions are and how their syntax works.

Ready to take a look at the questions? Let’s go!

Table for Questions 1, 2, and 3

For the first set of questions, we’ll use the table employees. Here’s what it looks like:

idfirst_namelast_namedepartmentsalarymanager_id
1AngelikaVoulesMarketing5,293.742
2RozelleSwynleyMarketing8,295.0818
3WarrenWilleyEngineering9,126.7219
4LynelleWhitenManagement Board10,716.15NULL
5ConsolataRomanLegal8,456.064
6HoebartBaldockResearch and Development4,817.3420
7StarleneWatkissAccounting6,541.484
8BardeRibbensMarketing4,852.872
9LornePhilipsenEngineering7,235.593
10PedroNaldrettResearch and Development5,471.6220
11BrinaDillingerMarketing6,512.172
12VerileSonleyResearch and Development4,574.4120
13NobleGeerlingResearch and Development8,391.1820
14GareyMacAdamAccounting3,829.887
15TheoSorrellEngineering6,441.673
16ErminieGellingResearch and Development8,590.7020
17LoralieKoopAccounting5,248.467
18CalAndreyManagement Board11,258.82NULL
19QuinceyGamellManagement Board11,366.52NULL
20JanithMcGiffieResearch and Development7,428.8319

What does the data tell you? The first row, for example, says Angelika Voules works in Marketing, and her salary is 5,293.74. Her boss has a manager ID of 2; look it up under the column id, and you see Angelika Voules’ boss is Rozelle Swynley.

There are three rows with NULL values in the manager_id column:

idfirst_namelast_namedepartmentsalarymanager_id
4LynelleWhitenManagement Board10,716.15NULL
18CalAndreyManagement Board11,258.82NULL
19QuinceyGamellManagement Board11,366.52NULL

It means these three employees have no managers. This makes sense since all three of them are on the management board.

Now, let’s see the interview questions.

Interview Question 1: Find the Average Salary by Department

Using the table employees, show all the employees, their departments, salaries, and the average salary in their respective department. Order the result by department.

Solution Query

WITH avg_salary AS (
		SELECT	AVG(salary) AS average_salary,
				department
		FROM employees
		GROUP BY department)
		
SELECT	e.first_name,
		e.last_name,
		e.department,
		e.salary,
		avgs.average_salary	
FROM employees e
JOIN avg_salary avgs
ON e.department = avgs.department
ORDER BY department;

Solution Query Explanation

This query uses a CTE named avg_salary to calculate the average salary by department. The SELECT statement in the CTE groups the rows by department and uses the AVG() aggregate function to calculate the average for each department.

Once we have that, we combine it with other columns from the table employees to complete the answer to the interview question. To do so, we join the table employees with the CTE as we would do with any two tables. We select the columns first_name, last_name, department, and salary from the table employees, and the column average_salary from the CTE. For convenience, we use aliases for the table and the CTE. Finally, we order the result by department.

The Result Table

Here’s what the result looks like:

first_namelast_namedepartmentsalaryaverage_salary
GareyMacAdamAccounting3,829.885,206.61
LoralieKoopAccounting5,248.465,206.61
StarleneWatkissAccounting6,541.485,206.61
WarrenWilleyEngineering9,126.727,601.33
LornePhilipsenEngineering7,235.597,601.33
TheoSorrellEngineering6,441.677,601.33
ConsolataRomanLegal8,456.068,456.06
CalAndreyManagement Board11,258.8211,113.83
QuinceyGamellManagement Board11,366.5211,113.83
LynelleWhitenManagement Board10,716.1511,113.83
AngelikaVoulesMarketing5,293.746,238.47
RozelleSwynleyMarketing8,295.086,238.47
BardeRibbensMarketing4,852.876,238.47
BrinaDillingerMarketing6,512.176,238.47
NobleGeerlingResearch and Development8,391.186,545.68
ErminieGellingResearch and Development8,590.706,545.68
VerileSonleyResearch and Development4,574.416,545.68
PedroNaldrettResearch and Development5,471.626,545.68
HoebartBaldockResearch and Development4,817.346,545.68
JanithMcGiffieResearch and Development7,428.836,545.68

The Result Explanation

This table tells us that, for example, Starlene Watkiss’ salary is 6,541.48, and the average salary in her department (Accounting) is 5,206.61.

first_namelast_namedepartmentsalaryaverage_salary
StarleneWatkissAccounting6,541.485,206.61

You can interpret the remaining rows in the result in the same way.

Interview Question 2: Find the Highest Salary by Department

Find the employee with the highest salary in each department. Show their first and last names, salaries, and departments.

Solution Query

WITH highest_salary AS (
		SELECT	first_name,
				last_name,
				department,
				salary,
				RANK () OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
		FROM employees)

SELECT	first_name,
		last_name,
		salary,
		department
FROM highest_salary
WHERE salary_rank = 1;

Solution Query Explanation

This time, the CTE is named highest_salary. We use it to rank employees within each department by salary. How? By using the window function RANK().

We partition the data by the column department. This means we rank the salaries only within each department, not across the company overall. The data is sorted according to the column salary in descending order because we want the highest salary in the department to have the rank of 1.

We also select several columns from the table employees in the same CTE. We need data from this CTE in the next SELECT statement.

And in that SELECT statement, we select the columns needed to answer the interview question. All the columns are from the CTE. We filter the result using a WHERE clause to get only the rows where the salary rank is 1, i.e., the rows with the highest salary.

The Result Table

first_namelast_namesalarydepartment
StarleneWatkiss6,541.48Accounting
WarrenWilley9,126.72Engineering
ConsolataRoman8,456.06Legal
QuinceyGamell11,366.52Management Board
RozelleSwynley8,295.08Marketing
ErminieGelling8,590.70Research and Development

The Result Explanation

It isn’t too difficult to interpret the result. Let’s take this employee as an example:

first_namelast_namesalarydepartment
ConsolataRoman8,456.06Legal

The data shown tells us Consolata Roman has the highest salary in the Legal department at 8,456.06.

Let’s move on to the third SQL CTE interview question.

Interview Question 3: Find All Employees Under a Specific Manager

Find all employees working directly or indirectly under the employee whose ID is 18.

Solution Query

WITH RECURSIVE subordinates AS (
		SELECT	id,
				first_name,
				last_name,
				manager_id
		FROM employees
		WHERE id = 18

	UNION

		SELECT	e.id,
				e.first_name,
				e.last_name,
				e.manager_id
		FROM employees e
JOIN subordinates s
ON e.manager_id = s.id
)
		
SELECT *
FROM subordinates
WHERE id != 18;

Solution Query Explanation

In this example, it is not just a regular CTE – rather, we use a recursive CTE to get the result. The difference in syntax is WITH RECURSIVE instead of just WITH. The CTE is named subordinates, and then comes the SELECT statement just like in non-recursive CTEs.

This statement selects certain columns from the table employees but only for the employee whose ID is 18. Then, we use the UNION operator to link the result of this SELECT statement with the result of another SELECT statement. For this, both SELECT statements need to have the same columns.

The second SELECT statement in the CTE selects columns where manager_id (from the table employees) is equal to the id (from the CTE).

We get all the data for the employee whose ID is 18. Then, we find the direct subordinates, and by recursion, we find the subordinates’ subordinates, until we go all the way down the hierarchy of the organization.

Then we get the columns from the CTE subordinates and remove the employee whose ID is 18. That’s it!

The Result Table

idfirst_namelast_namemanager_id
2RozelleSwynley18
1AngelikaVoules2
8BardeRibbens2
11BrinaDillinger2

The Result Explanation

Rozelle Swynley’s direct boss is the employee whose manager’s ID is 18. But Rozelle has subordinates, too. They are Angelika Voules, Barde Ribbens, and Brina Dillinger. We know this because the table shows their manager’s ID is 2, which is Rozelle Swynley’s ID. They are Rozelle Swynley’s direct subordinates; they are also indirect subordinates of the employee whose ID is 18. The employee whose ID is 18 is Cal Andrey, who is not shown in the result because we filter it out in the WHERE clause.

Tables Used for Questions 4 & 5

For the last two interview questions, we’ll use the tables customers and orders.

The table customers looks like this:

idfirst_namelast_name
1SimonPaulson
2DylanBobson
3RebMackennack

The orders table is a little bigger:

idorder_dateorder_amountcustomer_id
12021-10-0142.123
22021-10-01415.631
32021-10-0284.992
42021-10-0228.963
52021-10-0254.311
62021-10-0374.261
72021-10-0377.772
82021-10-0355.703
92021-10-0416.943
102021-10-0451.441
112021-10-0541.583
122021-10-0695.001

This second table contains data about the orders placed. The column customer_id is a foreign key to the primary key of the first table, so we can identify which customer placed which order. Let’s take the first row as an example:

idorder_dateorder_amountcustomer_id
12021-10-0142.123

It’s an order placed on October 1, 2021. The value of the goods ordered is 42.12, and the order was placed by the customer whose ID is 3. From the table customers, you see it is Reb Mackennack.

Let’s see the tasks to be solved using these tables.

Interview Question 4: Find the Average Number of Orders

This interview question asks you to use a SQL common table expression to find the average number of orders per customer.

Solution Query

WITH orders_count AS (
		SELECT	customer_id,
				COUNT(*) AS no_of_orders
		FROM orders
		GROUP BY customer_id)

SELECT	AVG(no_of_orders) AS avg_no_of_orders
FROM orders_count;

Solution Query Explanation

To get the result, you first count the number of orders by customer. We do this using the CTE orders_count and the COUNT() function in it. This function counts the number of rows in the table orders. Since we're interested not in the total count of orders but rather in the number of orders by customer, I group the result by the column customer_id.

Now that we have the number of orders, it’s easy to calculate the average number. Simply use the AVG() function in the SELECT statement, and you get the answer to the interview question.

The Result Table

avg_no_of_orders
4

The Result Explanation

A very small table requires a very short explanation. The above table simply shows the average number of orders per customer is four.

Interview Question 5: Find the Number of Consecutive Days With Order

In this CTE example, you have to calculate how many days in a row each customer places an order. It basically asks you to calculate the length of a contiguous series without gaps. Note that all the orders are within the same month.

Solution Query

WITH groupings_by_date AS (
	SELECT	c.id,
			c.first_name,
			c.last_name,
			RANK() OVER (PARTITION BY c.id ORDER BY o.order_date) AS row_number,
			o.order_date,
			EXTRACT(DAY FROM o.order_date) - RANK() OVER (PARTITION BY c.id ORDER BY o.order_date) AS date_group
FROM customers c
JOIN orders o
ON c.id = o.customer_id
)

SELECT	id,
		first_name,
		last_name,
		COUNT(*) AS orders_in_row
FROM groupings_by_date
GROUP BY id, first_name, last_name, date_group;

Solution Query Explanation

This rather lengthy query starts with a CTE. It selects certain columns from the tables customers and orders. There’s a RANK() window function as in Question 2. This time, we use this function to allocate a row number to every order within the same customer. For that reason, we partition the data by the column id. The ranking is performed according to the order date (we want the ranking to be sequential).

The other function used here is EXTRACT(). Its purpose is to extract the day part from the order date so that we can deduct the row number from it.

Why are we doing this? We're simply giving a group of consecutive orders a common date_group. If you run just this CTE, the first two rows of the result look like these:

idfirst_namelast_namerow_numberorder_datedate_group
1SimonPaulson12021-10-010
1SimonPaulson22021-10-020

As you can see, Simon Paulson placed orders on both October 1 and October 2, 2021. Since they were placed two days in a row, they belong to the same date_group.

How do we get this? It’s the day extracted from the order_date minus the row_number.

In the first row, the day from the date ‘2021-10-01’ is 1. The row number is also 1. So, it’s 1-1 = 0, which is also the value in the date_group. The second row is 2-2 = 0, which is the same date_group as above.

The specific value of date_group doesn’t really matter! It matters only that the consecutive days have the same date_group value. This is a little trick to calculate the length of a series. It works because, if the orders are placed every day, then the difference between the number of days in the order_date and value in the row_number is always the same for the same customer.

Note that this data does not appear anywhere. I’m just showing it to you for an easier explanation of what the CTE does here.

It’s important to mention this trick works only if your data is all within the same month. If the series, say, starts on 2021-10-31 and goes to 2021-11-01, the trick does not work; these two days, although they’re one after another, won’t belong to the same date_group. So, you have to understand your data before you decide to use this trick to get the length of a series.

Once you have the CTE, use it like a table in the SELECT statement. I’m going to use the COUNT() function to get the number of the orders in a row. I want to see the result for every customer, so I group the data by id, first_name, and last_name. I’m also grouping it by the column date_group, so the orders placed by the same customer get separated if there is a gap between them.

The Result Table

idfirst_namelast_nameorders_in_row
1SimonPaulson4
3RebMackennack5
2DylanBobson2
1SimonPaulson1

The Result Explanation

This table shows Simon Paulson has placed four orders in a row. Reb Mackennack has done that five days in a row, while Dylan Bobson only has two orders in a row. Finally, there’s one additional order by Simon Paulson.

When Are CTEs Useful?

As you see in these examples, CTEs are very useful when you have to calculate something in at least two steps. You use a CTE for preparing the preliminary calculation, and then you simply use the SELECT statement, reference the CTE, and perform another level of calculation.

In more complex calculations, you can also use several CTEs or even nested CTEs. It’s similar to subqueries, but CTEs make code more readable and easier to break down the calculation into steps. And with nested CTEs, you can write a query and immediately reference it like any other table.

Some of these examples show you can also use aggregate and window functions. This makes CTEs more powerful.

In addition, the CTEs are great for hierarchical data like organizational structures and for traversing a graph. However, you need to write a recursive CTE in these cases as I did in Question 3.

Finally, if you want to write recursive queries and use them, the first step is to learn CTEs. If you need help, here’s a guide on how to approach learning CTEs.

As you can see, CTEs have plenty of benefits and practical uses. These examples show you only a glimpse into solutions to practical problems that take much longer if it weren’t for CTEs. You want some more? No problem, take a look at some other practical uses of CTEs.

Once you learn CTEs and start using them, try to think about the problem first. Break down the calculation steps, and then translate this logic into an SQL common table expression code. CTEs are suitable for this.

Since they are a version of temporary tables, be clear when naming them: use a name that immediately tells what the CTE does and be consistent with naming conventions. Generally, you want to follow CTE best practices, which make your (and others!) code reading much easier.

Want to Know More About SQL Common Table Expressions?

Whether you’re preparing for an SQL job interview, want to be better at your current job, or simply want to learn something new in your free time, take a look at our Recursive Queries course.

There, you find a systematic approach to CTEs, explaining the theory behind them, showing you the syntax, and giving you some more examples to practice. The whole course is part of the Advanced SQL track, which covers two additional topics: window functions (which we’ve used in this article) and the GROUP BY extensions. There are many ways in which you can practice these advanced SQL topics on our platform. Good luck!