Back to articles list April 10, 2018 - 5 minutes read Learn to Write a SQL Correlated Subquery in 5 Minutes Ignacio L. Bisso Ignacio is a database consultant from Buenos Aires, Argentina. He’s worked for 15 years as a database consultant for IT companies like Informix and IBM. These days, he teaches databases at Sarmiento University and works as a PostgreSQL independent SQL consultant. A proud father of four kids with 54 years in his backpack, Ignacio plays soccer every Saturday afternoon, enjoying every match as if it’s his last one. Tags: data analysis how to in sql SQL basics SQL subquery If you’re familiar with the famous Russian nesting doll, then SQL correlated subqueries should be a peace of cake to understand—subqueries are just queries nested within queries. An SQL subquery is often called an “inner” query; the main query is usually called the “outer” query. This article covers everything you need to know about correlated subqueries. What Exactly is a SQL Correlated Subquery? A correlated SQL subquery is just a subquery that is executed many times—once for each record (row) returned by the outer (main) query. In other words, the outer query returns a table with multiple rows; the inner query then runs once for each of those rows. If your outer query returns 10 rows, then the inner query will run 10 times. And if your outer query returns 100 rows, the inner query will run 100 times. When Should You Use a SQL Correlated Subquery? Some data questions can only be answered with correlated subqueries. This is particularly true when asking negative data questions. Negative data questions arise when we search for records that don’t satisfy a particular condition. An example of a simple negative data question is: “Obtain the names of all movies not produced by Steven Spielberg.” We’ll solve an example of a negative data question in this article. Before we do so, let's first review the tables involved. We have two tables: the employee table and the payment_history table. The second table has a column named payment_type that denotes whether a payment is part of a regular salary, a bonus, or an award. Let's see some sample data from the tables we will use: employee employee_id payment_type amount_paid payment_date 100 salary 2000.00 2018-Mar-02 101 salary 1800.00 2018-Mar-02 102 salary 1900.00 2018-Mar-02 101 award 500.00 2018-Mar-08 102 adjustment 124.70 2018-Mar-10 payment_history employee_id payment_type amount_paid payment_date 100 salary 2000.00 2018-Mar-02 101 salary 1800.00 2018-Mar-02 102 salary 1900.00 2018-Mar-02 101 award 500.00 2018-Mar-08 102 adjustment 124.70 2018-Mar-10 Here’s the negative data question we’d like to answer: “Obtain the names of employees who never received an award” The SQL query that answers the previous question is the following: SELECT last_name, first_name FROM employee e1 WHERE NOT EXISTS (SELECT ph.last_name FROM payment_history ph WHERE ph.employee_id = e1.employee_id AND ph.payment_type = 'award') SQL Correlated Subqueries Vs. Simple Subqueries The main difference between a SQL correlated subquery and a simple subquery is that a SQL correlated subquery references columns from the table of the outer query. In the above example, ph.employee_id = e1.employee_id is a reference to the outer subquery table (e1). To identify a correlated subquery, just look for these kinds of references. If you find at least one, you have a correlated subquery! The negative part of a data question is often solved in a SQL correlated subquery by using the NOT EXISTS operator in the WHERE clause. EXISTS is an operator always followed by a subquery. If the subquery returns at least one record, then EXISTS evaluates to TRUE. If the subquery returns an empty set, then EXISTS evaluates to FALSE. Note that we use NOT EXISTS, which is simply the opposite of EXISTS. The result of the previous query is: first_name last_name John Smith Alice Johnson Another Correlated Subquery Example In this example, we’ll try to obtain the names of all employees who earned higher salaries in March 2018 than their average monthly salaries for all previous months using a SQL correlated subquery. Here’s the query we’ll run: SELECT first_name, last_name FROM employee e1, payment_history ph WHERE e1.employee_id = ph.employee_id AND amount_paid > = ( SELECT AVG(amount_paid) FROM payment_history ph2 WHERE ph2.employee_id = e1.employee_id AND ph2.payment_date < '01/03/2018' AND ph2.payment_type = 'salary' ) AND month(ph.payment_date) =3 AND year(ph.payment_date) = 2018 AND ph.payment_type ='salary' Enough Negativity. What About Positive Data Questions? Must we use a SQL correlated subquery to answer a positive data question? No, you don’t have to. You can still do so if you want to, though. For positive questions, we can usually just use a JOIN condition or a relationship between two tables. Let's change our previous question to a positive one and solve it with a JOIN instead a correlated subquery. The question becomes: “Obtain the names of employees who received award payments.” And the SQL query (without correlated subqueries) that answers this question is: SELECT first_name, last_name FROM employee e1 JOIN payment_history ph ON ph.employee_id = e1.employee_id WHERE ph.payment_type =award' The result is: first_name last_name Kate Miller Warning: Your SQL Correlated Subquery Is Likely to Be Slow I’d just like to mention that we try not to overuse SQL correlated subqueries, if possible. Recall that a correlated subquery runs once for each record returned by the outer query. If the outer query returns thousands upon thousands of records, you can imagine how quickly your query will slow down in performance. In general, you should only use a SQL correlated subquery if it’s absolutely necessary. Try a SQL Correlated Subquery Yourself! We’ve seen that correlated subqueries are an important part of the SQL language and can help us answer different data questions, especially negative ones. We also explained how to recognize a SQL correlated subquery and why we should generally try to avoid correlated subqueries, if possible, due to performance reasons. To improve your subquery skills, try LearnSQL.com’s SQL Basics course. We have a specific section devoted to subqueries, with plenty of practical exercises and correlated subqueries examples to help you master the content. Tags: data analysis how to in sql SQL basics SQL subquery You may also like What Are the Different Types of SQL Subqueries? What are SQL subqueries? How many types are there, and what can they do? This article will guide you through SQL subquery basics. Read more Subquery vs. JOIN What are the differences between a subquery and a JOIN in SQL? How do you choose one over the other? See examples of when to use which. Read more Subquery vs. CTE: A SQL Primer SQL subqueries and CTEs seem similar, but even pros may not know all their quirks. Learn their similarities, differences, and best use cases. Read more Beginner’s Guide to the SQL Subquery Did you know that you can use a SQL query inside another SQL query? Find out the ABCs of subqueries and improve your SQL skills! Read more You Want to Learn SQL? You've Come to the Right Place! If you want to learn SQL basics or enhance your SQL skills, check out LearnSQL.com for a wide range of SQL courses and tracks. Read more Converting Subqueries to Joins No more badly performed queries! Find out how you can convert SQL subqueries to joins and improve your query efficiency. Read more Correlated Subquery in SQL: A Beginner’s Guide Correlated subqueries are the only way to solve some SQL statements. But they can be very slow. In this post, we’ll talk about why, how, and when to use them. Read more SQL Subqueries The article describes what a subquery is and what these useful statements look like. We will cover basic examples with the IN, EXISTS, ANY, and ALL operators, look at subqueries in FROM and WHERE clauses, and explore the difference between correlated and nested subqueries. Read more Subscribe to our newsletter Join our weekly newsletter to be notified about the latest posts.