15th Sep 2020 6 minutes read

GROUP BY in SQL Explained

GROUP BY

Table of Contents

How Does SQL GROUP BY Work?
GROUP BY in Action
Using Aggregate Functions with GROUP BY
GROUP BY Two Columns
Ready to Use GROUP BY in SQL?

The SQL GROUP BY statement is easy to use, but it can be hard to master. Learn what you can do with GROUP BY, how it works, and how to use it with aggregate functions.

GROUP BY is one of the basic SQL tools. However, it might be difficult for beginners to understand how GROUP BY works and how it is used to compute statistics with different aggregate functions. In this article, I’ll use several examples to show exactly how GROUP BY organizes rows and how aggregate functions can be used to calculate statistics on one or more columns.

So, let’s get started.

How Does SQL GROUP BY Work?

GROUP BY is an indispensable tool for any data analyst working with SQL. If you want to organize your data in groups and calculate some kind of aggregate statistics for these groups, the GROUP BY clause is what you need.

So, what does GROUP BY do? Basically, it groups rows with the same value into one group or bucket. Let’s see how it works.

Suppose we’re running a bookstore and want to know how many books of different genres we have in stock. Our database includes a table that lists the books’ titles, genres, and stock quantity.

The visualization below shows how the GROUP BY clause creates groups from table data. We want to know the total quantity of books for each genre; thus, GROUP BY groups the books of the same genre and sums up the corresponding quantities. This creates a result table that lists genres and their total quantity of books in our stock.

For a more detailed introduction to the GROUP BY topic, see this video tutorial from our “We Learn SQL” series.

GROUP BY in Action

Now it’s time for more specific examples of SQL queries with a GROUP BY clause. We’ll use the books table, which stores the ID, title, author, genre, language, price, and quantity of each novel we stock.

id	title	author	genre	lang	price	qty
1	Les Trois Mousquetaires	Alexandre Dumas	adventure	fr	11.90	4
2	A Game of Thrones	George R.R. Martin	fantasy	en	8.49	5
3	Pride and Prejudice	Jane Austen	romance	en	9.99	2
4	Vampire Academy	Richelle Mead	fantasy	en	7.99	3
5	Ivanhoe	Walter Scott	adventure	en	9.99	3
6	Armance	Stendhal	romance	fr	5.88	1

To calculate the total number of books of each genre, we’ll use the following query:

SELECT genre, SUM(qty) AS total
FROM books
GROUP BY genre;

Here, in the GROUP BY clause, we select our rows to be grouped by the column genre. Then the SUM(qty) function in the SELECT statement sums the qty values within each group (i.e. each book genre), with the result displayed in the corresponding total field:

genre	total
adventure	7
fantasy	8
romance	3

Using Aggregate Functions with GROUP BY

GROUP BY puts rows with the same value into one bucket. We usually want to compute some statistics for this group of rows, like the average value or the total quantity. To this end, SQL provides aggregate functions that combine values from a certain column into one value for the respective group.

So far, we’ve only used SUM() as our aggregate function for grouping the book titles in stock. However, this is not the only aggregate function you can use with GROUP BY. SQL also offers:

COUNT() to calculate the number of rows in each group.
AVG() to find the average value for each group.
MIN() to return the minimum value in each group.
MAX() to return the maximum value in each group.

Let’s see how the AVG() function works with GROUP BY. This time, we want to calculate the average price for books in each genre. We’ll start by visualizing the output we want to get.

We’re again grouping our books by genre, but this time we want to calculate the average book price in each genre. The SQL query for this looks as follows:

SELECT genre, AVG(price) AS avg_price
FROM books
GROUP BY genre;

This query creates a table with two columns (genre and avg_price), where the average price is calculated by averaging the price values for the books of each genre:

genre	avg_price
adventure	10.945
fantasy	8.24
romance	7.935

Actually, we are not limited to using only one aggregate function with a GROUP BY clause. So, let’s add the information about the minimum and maximum price of the books in each genre:

SELECT genre, 
MIN(price) AS min_price, 
     AVG(price) AS avg_price, 
MAX(price) AS max_price
FROM books
GROUP BY genre;

The result set now includes four columns: genre, min_price, avg_price, and max_price.

genre	min_price	avg_price	max_price
adventure	9.99	10.945	11.90
fantasy	7.99	8.24	8.49
romance	5.88	7.935	9.99

Note that when using GROUP BY the SELECT statement may only include fields that are either used in the aggregate function or listed in the GROUP BY clause. For example, in our case, we cannot add title or author to our result set. To do so makes no sense, as each row in our output table includes information on several books with different titles and authors.

GROUP BY Two Columns

In SQL, you can also group your data using several columns. For example, let’s say we want to group our books not only by genre but also by language. And we want to calculate the number of book titles in each category.

Hint: The COUNT() aggregate function will assist us in calculating the number of book titles (i.e. rows).

To compute the number of book titles by genre and language, we’ll use the following query:

SELECT genre, lang, count(title) AS titles
FROM books
GROUP BY genre, lang;

We’ll get an output table with three columns: genre, language (lang), and the number of titles in that category:

genre	lang	titles
adventure	en	1
adventure	fr	1
fantasy	en	2
romance	en	1
romance	fr	1

So, in our example dataset, we have:

One English-language book title in the adventure genre (Ivanhoe).
One French-language book title in the adventure genre (Les Trois Mousquetaires).
Two English-language book titles in the fantasy genre (A Game of Thrones and Vampire Academy).
One English-language book title in the romance genre (Pride and Prejudice).
One French-language book title in the romance genre (Armance).

As you see, there are no French-language fantasy books in our dataset.

For more examples of GROUP BY applications, check out this article showing how GROUP BY can be leveraged in realistic business scenarios.

Ready to Use GROUP BY in SQL?

Now you have a better understanding of how GROUP BY can assist you with organizing and analyzing your data. However, if you want to handle GROUP BY like a pro, you need lots of practice.

LearnSQL.com offers an SQL Basics course that includes 129 interactive exercises covering aggregation, grouping, JOINs, subqueries, and more. This course is a perfect opportunity to learn how to group rows and compute statistics with aggregate functions.

Happy learning!

Tags:

GROUP BY

How Does SQL GROUP BY Work?

GROUP BY in Action

Using Aggregate Functions with GROUP BY

GROUP BY Two Columns

Ready to Use GROUP BY in SQL?

You may also like