6th Feb 2020 5 minutes read HAVING vs. WHERE in SQL: What You Should Know Ignacio L. Bisso GROUP BY Table of Contents Sample Dataset WHERE and HAVING: Simple Examples COMPLEX HAVING CLAUSES Using WHERE and HAVING in the Same SQL Query WHERE and HAVING in Complex Queries Solved: WHERE vs. HAVING in SQL This article is about SQL’s WHERE and HAVING clauses. Both clauses are part of the foundations of the SQL SELECT command. They have similar uses, but there are also important differences that every person who uses SQL should know. Let’s see what’s behind the HAVING vs. WHERE debate. In this article, we’ll suppose we work for a government social agency supporting people or families whose income is below a certain threshold. This agency uses several metrics to identify people or families needing help. First, let’s look at the sample dataset. Then, in the rest of the article, we’ll create queries based on this dataset. Sample Dataset This dataset describes persons belonging to four families that live in two cities. For simplicity’s sake, we will assume that last_name identifies the family. The idea is to obtain metrics at the person level and at the family level. TABLE Persons namelast_namebirth_dateyear_incomecity MaryRoberts1964-01-1178000Oklahoma PeterRoberts1962-09-2586500Oklahoma JohnRoberts1999-06-030Oklahoma SueRoberts1996-03-060Oklahoma MelindaRoberts1998-04-040Oklahoma GeorgeHudson1953-02-2348000Oklahoma NancyHudson1958-12-0665000Oklahoma AnnHudson1979-04-0235000Oklahoma CarlGibson1963-04-03102800Phoenix LiGibson1963-12-2796000Phoenix KateBishop1994-07-10920000Phoenix MarkBishop2018--9-130Phoenix Okay. Now that we’ve seen the dataset, let’s get started! WHERE and HAVING: Simple Examples In simple words, the WHERE and HAVING clauses act as filters; they remove records or data that don’t meet certain criteria from the final result of a query. However, they are applied to different sets of data. That’s the important point to understand about WHERE vs. HAVING: WHERE filters at the record level, while HAVING filters at the "group of records" level. Let’s see some examples. Here is a sample query that uses the WHERE clause: Suppose we want to obtain the names of people with an annual income greater than $100,000. We need to filter (or discard) at the record level, so we will use the WHERE clause instead of the HAVING clause for this query: Query text for copy-paste purposes: SELECT name, last_name FROM persons WHERE year_income > 100000; Now let’s try a similar query, but this time with the HAVING clause: Suppose we want to obtain the last_name of families having a household income (i.e. the summed income of all family members) over $100,000. This is a clear case for using the HAVING clause, as we don’t need to filter by record. (We are not going to discard a person’s record because they make less than $100,000.) The idea is to filter based on family income, so we need to group persons by last_name and use HAVING to filter the groups of persons, as shown below: Query text for copy-paste purposes: SELECT last_name, SUM(year_income) AS "family_income" FROM persons GROUP BY last_name HAVING SUM(year_income) > 100000; COMPLEX HAVING CLAUSES We can use as many aggregate functions as we wish in the condition of the HAVING clause. Let’s go further with our analysis of family incomes and calculate the average income per member for each family. We want to identify families that make less than 50,000 per person. From an economic point of view, this analysis can show more about family incomes than the previous one. Here’s the query: Query text for copy-paste purposes: SELECT last_name, COUNT(*) as "members", SUM(year_income) as "family_income", SUM(year_income) / COUNT(*) as "per_member_income" FROM persons GROUP BY last_name HAVING SUM(year_income) / COUNT(*) < 50000; Note: The HAVING clause has some restrictions; one of them is that record-level columns in the HAVING condition must also appear in the GROUP BY clause. Using WHERE and HAVING in the Same SQL Query It is very common to use WHERE and HAVING in the same query. Let’s do a query to obtain the total family income and the income per member for families in Oklahoma that have more than four members: Query text for copy-paste purposes: SELECT last_name, COUNT(*) as "members", SUM(year_income) as "family_income", SUM(year_income) / COUNT(*) as "per_member_income" FROM persons WHERE city = ‘Oklahoma’ GROUP BY last_name HAVING COUNT(*) > 4; WHERE and HAVING in Complex Queries To close the article, we’ll build a query that returns families with a household income lower than the average income in their cities. The tricky part is the subquery that obtains the average income for a given city. Please note that we use a different query because this is the average income per city; it’s not based on the number of members per family, but on the number of families in that city, as calculated with count(distinct last_name). Query text for copy-paste purposes: SELECT last_name, COUNT(*) AS "members", SUM(year_income) AS "family_income", SUM(year_income) / COUNT(*) AS "per_member_income" FROM persons p GROUP BY last_name HAVING SUM(year_income) < ( SELECT SUM(year_income) / COUNT(distinct last_name) FROM persons WHERE city = p.city ); Solved: WHERE vs. HAVING in SQL We’ve reviewed several different queries that use SQL’s WHERE and HAVING clauses. As we mentioned, both clauses work as filters, but each applies to a different data level. The WHERE clause filters at the record level, while the HAVING clause filters at the group level. SQL is a super flexible language, and you can create millions of combinations using the WHERE and HAVING clauses. At this point, I would like to mention a great course related to the topics in this article: LearnSQL's Creating Basic SQL Reports in SQL. It benefits people who know SQL basics and would like to improve their skills in creating using meaningful SQL reports. Go further – explore SQL! Tags: GROUP BY