Back to articles list Articles Cookbook
10 minutes read

How to Use COUNT() with GROUP BY: 5 Practical Examples

Using the COUNT() function with GROUP BY is one of the most common SQL constructs in aggregate queries. Read this article to find out how to use COUNT() with GROUP BY correctly using 5 examples.

In this article, we will explain the importance of using COUNT with GROUP BY. We’ll talk about why it is essential in SQL and how it enables data analysis and summarization based on specific criteria. This combination empowers users to extract meaningful insights, calculate counts, and generate statistical summaries from large datasets.

If you’re looking for an in-depth review of basic SQL concepts like COUNT() and GROUP BY, I recommend our interactive SQL Basics course. It contains 129 exercises, which will help you review all key SQL concepts.

How to Use COUNT() and GROUP BY

If you’re here just for the quick answer, here’s the TLDR:

The correct way of using COUNT() with GROUP BY is shown in the query below:

SELECT 
  Store,
  COUNT(*) as NumberOfSales, 
FROM Sales
GROUP BY Store;

Let's break down the result to understand how this query works. The rows with the same value in the Store column are grouped together. Imagine an intermediate table where these rows are grouped and marked with different colors, like the image below. This would be our intermediary table containing only the Store column, since that is the column that is part of our SELECT statement.

StoreProductIDCustomerID
Store A1657
Store A11116
Store A14525
Store B1369
Store B11138
Store C13616
Store C1118

The database then logically counts the number of rows in each group using the COUNT(*) function. This count represents the number of orders for each store.

Once the rows are counted, there is no need to have duplicate rows with the same Store value. So, GROUP BY will reduce the number of rows to only unique values. In other words, it will get rid of the individual rows and give us a summary of each group. We end up with the following result:

StoreNumberOfSales
Store A3
Store B2
Store C2

When to Use GROUP BY

GROUP BY is an SQL clause that groups rows based on one or more column values. It is often used in combination with aggregate functions like COUNT(), SUM(), AVG(), MAX(), and MIN() to perform calculations on grouped data.

The GROUP BY clause is useful when you want to:

  • Make calculations and aggregations on subsets of data.
  • Generate summary statistics and metrics for different groups or categories.
  • Identify patterns and trends within specific groups.
  • Generate reports and analyze data based on different dimensions or attributes.
  • Apply filters and conditions on grouped data, using the HAVING clause

In summary, GROUP BY is used to organize and summarize data based on specific columns, functions or expressions, which will allow you to gain insights and perform calculations on distinct groups within a dataset.

When to Use the COUNT() Function

COUNT() is one of SQL’s most common aggregate functions. It returns the number of rows that match a specified condition or are included in a result set. It is often used to retrieve the total number of records in a table or to calculate the number of occurrences of a particular value within a column.

5 Examples of Using COUNT() with GROUP BY

Now that we’ve gone through the basic scenarios where COUNT() and GROUP BY are used, let’s go over some of the more complicated examples. We’ll start off with simpler examples and work our way up to more complex scenarios.

Example #1: GROUP BY a Single Column

The simplest scenario you could encounter is when you need to GROUP BY a single column. In the following example, we need to find out how our company’s employee count is distributed across different job titles.

Before seeing the SQL solution to this scenario, let’s go over the sample data. This is the employees table:

EmployeeIDFirstNameLastNameJobTitle
1JohnDoeManager
2JaneSmithSupervisor
3MarkJohnsonDeveloper
4EmilyWilliamsAnalyst
5MichaelBrownDesigner
6SarahDavisDeveloper
7RobertWilsonDesigner
8JessicaTaylorDeveloper

By using the COUNT function with GROUP BY on the JobTitle column, we can get a breakdown of the number of employees in each specific role. You can see the query and the output (based on the sample data) below:

SELECT 
  JobTitle, 
  COUNT(*) AS NumberOfEmployees
FROM employees
GROUP BY JobTitle;
JobTitleNumberOfEmployees
Analyst1
Designer2
Developer3
Manager1
Supervisor1

This example works the same way as our initial query. GROUP BY puts the rows for employees with the same job title into one group. Then the COUNT() function counts the rows in each group. GROUP BY then collapses the rows in each group, keeping only the value of the column JobTitle and the count.

Example #2:  GROUP BY Multiple Columns

Of course, you can group rows by more than one column.

In this example, we will look at a sample orders table containing basic order information:

OrderIDCustomerIDProductIDProductCategoryOrderDateStatusAmount
11011001Electronics2023-05-01Completed150.00
21021002Clothing2023-05-02Completed80.00
31011001Home Goods2023-06-03In progress60.00
4103NULLAccessories2023-06-03Canceled200.00
51011002Electronics2023-07-04NULL120.00
61021001NULL2023-07-05NULLNULL
71031002Clothing2023-07-06In progress90.00
81021002Accessories2023-08-07NULL75.00
9103NULLNULL2023-08-08NULL100.00
101011001Home Goods2023-09-09NULLNULL
111021001Home Goods2023-06-05In progress80.00
121031004Accessories2023-06-06Completed75.00
131021005Electronics2023-08-06Completed88.00

We need to write a query that will show the number of orders placed by each customer and the ProductCategory of that order. This means that we will have to return the CustomerID and the category that the order falls within.

The query will look like this:

SELECT
  CustomerID,
  ProductCategory,
  COUNT(*) AS NumberOfOrders
FROM orders
GROUP BY CustomerID, ProductCategory;

And the result of running this query can be seen below:

CustomerIDProductCategoryNumberOfOrders
101Electronics2
101Home Goods2
102Electronics2
102Accessories2
102Clothing1
103Accessories2
103Clothing1
103NULL1

Our query groups rows by two columns: CustomerID and ProductCategory. This means that GROUP BY groups rows with the same values of CustomerID and ProductCategory into one group. (Rows for CustomerID 101 and the category Electronics are in one group, but rows for CustomerID 101 and the category Home Goods are in a different group.) Then the rows in each group are counted by COUNT().

Example #3: Using WHERE with COUNT() and GROUP BY

Our previous example analyzed a scenario where we wanted to create an aggregate of all of the information in our orders table. But sometimes, we might want to slice into this information and only see orders for selected categories.

Using the same sample data that we had before, we will now write a query that shows the same information for orders that fall within the “Accessories” or “Clothing” ProductCategory.

To do this, we can use the query from Example 2 and just add a WHERE clause. This clause will filter for records where ProductCategory is equal to “Accessories” or “Clothing”.

SELECT
  CustomerID,
  ProductCategory,
  COUNT(*) AS NumberOfOrders
FROM orders
WHERE ProductCategory IN (‘Accessories’, ‘Clothing’)
GROUP BY CustomerID, ProductCategory;
CustomerIDProductCategoryNumberOfOrders
102Accessories2
103Accessories1
102Clothing1
103Clothing1

If it’s not entirely intuitive how the database has generated the results using the query above, here’s a step-by-step explanation of what happened behind the scenes:

  1. First, the database scans the orders table and reads all the rows.
  2. It then applies the filtering condition in WHERE ProductCategory IN (‘Accessories’, ‘Clothing’ ) to filter the rows. After this step, only the rows where the product category is “Accessories” or “Clothing” are considered for further processing.
  3. The filtered rows are then grouped based on the values in the CustomerID and ProductCategory columns, which are specified in the GROUP BY clause.
  4. For each of the unique combinations of CustomerID and ProductCategory, the COUNT(*) function is applied. This will count the number of rows within each group.
  5. The final result set includes the CustomerID, ProductCategory and the count of orders (shown in the NumberOfOrders column) for each group.

In summary, the database will filter the rows that respect the specified filtering condition. It will then group them according to the specified columns in the GROUP BY clause, and then calculate the count of orders within each of those groups. The final output will include the CustomerID, ProductCategory and the corresponding count of orders for each unique combination of CustomerID and ProductCategory.

Example #4: Using ORDER BY with COUNT() and GROUP BY

Looking at the result from example number 4, we can see that some of the rows in the output are mixed. This is because the columns in the GROUP BY list will generally require – but not guarantee – sorting the output based on the list of columns in the GROUP BY section.

But if, for example, we need to organize the results on different conditions ( e.g. a timeline or by date) we will need to use an ORDER BY clause:

SELECT
  CustomerID,
  ProductCategory,
  COUNT(*) AS NumberOfOrders
FROM orders
WHERE ProductCategory IN (‘Accessories’, ‘Electronics’)
GROUP BY CustomerID, ProductCategory
ORDER BY ProductCategory, CustomerID;
CustomerIDProductCategoryNumberOfOrdes
101Accessories2
102Accessories2
102Electronics1
103Electronics2

We’ve added the ORDER BY clause with the list of columns we want to sort the data by. As you can see, this sorts the output information based on the order of the columns listed.

Example #5: COUNT(expression)

So far, we’ve looked at simple examples of working with COUNT(); the purpose was to count all of the rows in the source dataset or table.

However, there are more complex ways of using the COUNT–GROUP BY combination. To explain this, we’ll build a new sample dataset.

We have a table called SurveyResponses that stores answers to a survey. Some of the questions are optional, which is why there are some respondents with NULL answer values; they’ve skipped the optional questions. You can see the sample data below:

ResponseIDRespondentIDRespondentNameQuestionIDAnswer
1101John1Agree
2101John2No
3101John3Yes
4102Sarah1Yes
5102Sarah2Not Sure
6102Sarah3NULL
7103Mark1No
8103Mark2Maybe
9103Mark3No
10104Emily1Yes
11104Emily2Not Sure
12104Emily3Disagree

COUNT(*) - Counting Rows

The most common usage of the COUNT function (and its default functionality, even if it’s used with GROUP BY) is to count the number of rows. For example, if we want to count the answer types of survey questions, we can use the following query:

SELECT COUNT(*) AS NumberOfYesAnswers, Answer
FROM SurveyResponses
GROUP BY Answer;

The output will look like this:

NumberOfAnswersAnswer
3Yes
3No
2Not Sure
1Agree
1Disagree
1Maybe
1NULL

The outcome of this is a count of all rows that match a certain answer. It returns the aggregate number of survey answers and the type of the answer.

Counting NON-NULL Values in a Column

Let’s look at another example that might seem to generate the same results. Actually, this example has something particular and important about it; instead of using * as a parameter in our COUNT() function, we’re using COUNT() with a column name.

We will modify the query in the previous example. Instead of the * parameter for the COUNT() function, we will replace it with the Answer column. Our new query looks like this:

SELECT 
  COUNT(Answer) AS NumberOfAnswers, 
  AnswerFROM SurveyResponses
GROUP BY Answer;

If we run this new query, we will see that the results returned are almost exactly the same:

NumberOfAnswersAnswer
3Yes
3No
2Not Sure
1Agree
1Disagree
1Maybe
0NULL

We can see that the output for the NULL answer value has changed from 1 to 0. This is because the COUNT() function only takes into consideration non-NULL values when doing the aggregation.

Before we were doing COUNT(*), which implicitly means count the rows; COUNT(Answer) will count the values in the Answer column. And because we had 1 value with NULL, it will skip those values in its calculation – returning 0 in this second scenario.

Counting Distinct NON-NULL Values in a Column

In this third variation of using the COUNT function, we will use the same query as in our previous example. This time,  though, we will add the DISTINCT keyword before the column name.

SELECT 
  Answer, 
  COUNT(DISTINCT Answer) AS DistinctCount
FROM SurveyResponses
GROUP BY Answer;
StatusDistinctCount
Yes1
No1
Not Sure1
Agree1
Disagree1
Maybe1
NULL0

We can see in the output above that the result of this query has turned all of the positive values in the DistinctCount column to 1. Adding the DISTINCT keyword means that each time the COUNT function finds a new value it hasn’t seen before, it will add that status to its list and add a 1  to its count. However, if it finds the same status value a second time or more, it will skip counting it.

Want to Know More About COUNT() and GROUP BY?

Using COUNT() with GROUP BY is only one of the many powerful features that SQL offers. If you’re eager to explore SQL further and truly master its capabilities, I encourage you to check out our comprehensive SQL Basics course. If you’re looking for SQL practice, I recommend our SQL Practice track, which has over 600 SQL practice exercises.

After taking our courses, maybe you feel like you want to find a job working with SQL. To help you prepare for your interview, we have a list of SQL interview questions that will show you some real-world examples of questions and problems in SQL.