Back to articles list Articles Cookbook
Updated: 28th Aug 2023 8 minutes read

How to Use WHERE with GROUP BY in SQL

In this article, we’ll discuss how to combine the WHERE and GROUP BY clauses in SQL.

If you’re writing SQL on a daily basis, you will quickly realize how often both WHERE and GROUP BY clauses are used. WHERE is an essential part of most queries. It allows you to filter large datasets to only the pieces you are interested in. GROUP BY is one of the most powerful tools an analyst has when aggregating data. By the end of this article, you will understand how to effectively use both of these clauses while also avoiding common pitfalls.

Upgrade your SQL skills with our SQL Practice track! This track consists of 9 SQL practice courses (and we keep adding more!) and offers over 900 hands-on practice exercises. Each course is interactive, and the exercises are based on real-world scenarios. Don't miss this opportunity to practice SQL in a structured and interactive way.

When it comes to writing SQL, consider each clause (SELECT, WHERE, GROUP BY, etc.) as a separate tool. As you learn more SQL syntax, you add that tool to your tool kit. As you can imagine, a mechanic with only a few tools will not be very effective. I would not recommend you take your car to someone with only a handful of tools. It’s the same with data analytics. The more tools you have at your disposal, the more quickly and efficiently you can analyze different data sets.

Let’s get started by taking an in-depth look at WHERE and GROUP BY separately. After we’ve established this foundational knowledge, we’ll combine the two and unleash their full power.

The WHERE Clause

As stated previously, the WHERE clause is used to filter a dataset and return only the records that match some specified criteria. Consider the following dataset, which includes country populations for 2022.

For illustration purposes, we’ll say this data is housed in a table called world_populations and only includes the top 10 most populous countries. We’ll take a look at this data by running:

SELECT *
FROM world_populations;

Here is the result:

countrypopulationyearly_changenet_changemed_ageurban_pop_pctworld_share
China1,439,323,7760.00395,540,09038.61.1847
India1,380,004,3850.009913,586,63128.35.1770
United States331,002,6510.00591,937,73438.83.0425
Indonesia273,523,615.01072,898,04730.56.0351
Pakistan220,892,340.024,327,02223.35.0283
Brazil212,559,41700721,509,89033.88.0273
Nigeria206,139,589.02585,175,99018.52.0264
Bangladesh164,689,383.01011,643,22228.39.0211
Russia145,934,462.000462,20640.74.0187
Mexico128,932,753.01061,357,22429.84.0165

OK, we’ve got our data. Now let’s add some filters.

Example #1

In our first example, we only want to see countries where the population is greater than 200 million. We’ll accomplish this by running the following query:

SELECT *
FROM world_populations
WHERE population > 200000000;

And the result:

countrypopulationyearly_changenet_changemed_ageurban_pop_pctworld_share
China14393237760.0039554009038.61.1847
India13800043850.00991358663128.35.1770
United States3310026510.0059193773438.83.0425
Indonesia273523615.0107289804730.56.0351
Pakistan220892340.02432702223.35.0283
Brazil2125594170.72%15098903388%2.73%
Nigeria2061395892.58%51759901852%2.64%

We can see that the WHERE clause filtered out Bangladesh, Russia, and Mexico because their populations fall below the threshold established in our query.

Example #2

The WHERE clause can also support multiple filters. Let’s only list countries that saw a positive yearly change in the number of migrants and where at least 80% of the population lived in urban areas. Note that we’ve used AND in the WHERE clause to connect these two conditions:

SELECT *
FROM world_populations
WHERE migrants > 0
  AND urban_pop_pct > .80;

This query returns:

countrypopulationyearly_changenet_changemed_ageurban_pop_pctworld_share
United States3310026510.0059193773438.83.0425
Brazil2125594170.0072150989033.88.0273

There is no limit to the amount of filtering you can do in a WHERE clause. For more details, check out our Complete Guide to the SQL WHERE Clause

The GROUP BY Clause

To understand GROUP BY, we’ll use a different dataset called transactions. It contains transactional data for an online retailer:

DescriptionProductCategoryQuantityUnitPriceCustomerID
KNITTED UNION FLAG HOT WATER BOTTLEKitchen63.3917850
POPPY'S PLAYHOUSE BEDROOMToys62.117850
IVORY KNITTED MUG COSYKitchen61.6513047
BOX OF VINTAGE JIGSAW BLOCKSToys34.9513047
RED COAT RACK PARIS FASHIONClothing34.9513047
YELLOW COAT RACK PARIS FASHIONClothing34.9513047
BLUE COAT RACK PARIS FASHIONClothing34.9513047

Example #1

The GROUP BY operator is used to aggregate data by grouping records that share the same value in a specified field. To answer the question, “How many items were sold in each product category?” we would run the following query:

SELECT 
  productCategory,
  SUM(quantity) as quantity
FROM transactions 
GROUP BY productCategory;

Here are the results:

ProductCategoryQuantity
Kitchen12
Toys9
Clothing9

There are two products in the kitchen product category: KNITTED UNION FLAG HOT WATER BOTTLE and IVORY KNITTED MUG COSY. Each of these products has a quantity of 6; therefore, the total sum for the kitchen product category is 12.

There are two products in the toy product category. POPPY'S PLAYHOUSE BEDROOM has a quantity of 6 and BOX OF VINTAGE JIGSAW BLOCKS has a quantity of 3, for a total sum of 9.

Finally, there are three products in the clothing category, each with a quantity of 3. This brings the total sum of the clothing category to 9.

We will not go through them all, but there are many types of aggregations available to the data analyst. These include SUM(), AVG(), COUNT(), MEDIAN(), MIN(), and MAX(). More information can be found in the article How to Use Aggregate Functions in the WHERE Clause.

Example #2

What if we want to aggregate quantities for each product category and customer? We can GROUP BY multiple columns:

SELECT 
  customerId,
  productCategory,
  SUM(quantity) as quantity
FROM transactions 
GROUP BY customerId, productCategory;

And this is the result:

CustomerIDProductCategoryQuantity
17850Kitchen6
17850Toys6
13047Kitchen6
13047Toys3
13047Clothing9

We can see that there’s a row for each combination of customer and product category; now we know how many items of each category every customer has purchased.

For more information about GROUP BY, check out Using GROUP BY in SQL and GROUP BY in SQL Explained.

Using WHERE and GROUP BY Together

Now that we’ve laid the foundation, let’s combine WHERE and GROUP BY together. It’s important to remember that the WHERE clause is going to filter the dataset before the GROUP BY clause is evaluated. Also, the WHERE clause will always come before GROUP BY. If  you put it after, the query will return an error.

Example #1

Consider the same dataset and think about how we would answer the question “What is the average unit price of kitchen products and toys?”

First we’ll need to filter for kitchen products and toys. Next, we’ll calculate the average unit price. Here’s the query:

SELECT
  productCategory,
  AVG(UnitPrice) as AvgUnitPrice
FROM transactions 
WHERE productCategory in ('Kitchen', 'Toys')
GROUP BY productCategory;

And the result is:

ProductCategoryAvgUnitPrice
Kitchen2.52
Toys3.05

The average unit price is computed only for kitchen and toy products. All other categories are filtered out of the query.

Example #2

Now let’s consider how we would get the total quantity of products by category where the average unit price is greater than 3. To do this, we would run: 

SELECT
  productCategory,
  SUM(quantity) as quantity
FROM transactions 
WHERE unitPrice > 3
GROUP BY productCategory;

In this example, the aggregates are computed on the filtered rows. We get the following result:

ProductCategoryQuantity
Kitchen6
Toys3
Clothing9

Since only the products KNITTED UNION FLAG HOT WATER BOTTLE, BOX OF VINTAGE JIGSAW BLOCKS, RED COAT RACK PARIS FASHION, YELLOW COAT RACK PARIS FASHION, and BLUE COAT RACK PARIS FASHION have unit prices greater than 3, those are the only products included in the aggregation.

If we were to exclude the WHERE clause, we would get the following results:

ProductCategoryQuantity
Kitchen12
Toys9
Clothing9

Here we see the quantities of all products, regardless of the unit price, are totaled. The total quantity is different for all products vs. for products with a unit price over 3.

Example #3

What if we want to filter by an aggregated column? This is the job of the HAVING clause. You cannot put an aggregation in the WHERE clause. The HAVING clause is used instead of WHERE when filtering based on aggregate functions. We can illustrate this with another example. Let’s continue with the above example and filter the results by product categories where the average unit price is greater than 3. To accomplish this, we’d write:

SELECT
  productCategory 
  AVG(UnitPrice) as AvgUnitPrice
FROM transactions 
WHERE productCategory in (‘Kitchen’, ‘Toys’)
GROUP BY productCategory
HAVING AVG(UnitPrice) > 3;

Result:

ProductCategoryAvgUnitPrice
Toys3.05

The first thing that is happening is that the WHERE clause limits the data to just the kitchen and product categories. The GROUP BY then aggregates the average unit price for each category. Finally, the HAVING clause further filters the results to include only product categories with an average unit price greater than 3.

If we were to remove the HAVING clause, we would see that both the kitchen and toy categories would be seen in the result. However, since the average unit price of kitchen items is less than 3, it is filtered out when we add the HAVING clause.

For more examples, read our article on HAVING vs. WHERE in SQL.

WHERE and GROUP BY - Where to Go From Here

Great work! You’ve gained the foundational knowledge needed to combine WHERE and GROUP BY clauses in SQL. You’ve been introduced to the difference between WHERE and HAVING. More importantly, you can combine all three of these clauses together to filter and aggregate data to fit your needs.

You’ve added two more tools to your toolkit and are ready to use them. A great way to reinforce these learnings is by taking our interactive SQL Basics course. Practice makes perfect!

Also, be sure to check out all of our other great articles.