Back to articles list September 15, 2020 - 6 minutes read GROUP BY in SQL Explained Kateryna Koidan Kateryna is a data science writer from Kyiv, Ukraine. She worked for BNP Paribas, the leading European banking group, as an internal auditor for more than 6 years. More recently, she decided to pursue only the favorite part of her job—data analysis. Now she is continuing her self-education with deep-learning courses, enjoys coding for data analysis and visualization projects, and writes on the topics of data science and artificial intelligence. Kateryna is also a proud mother of two lovely toddlers, who make her life full of fun. Tags: sql learn sql GROUP BY ORDER BY The SQL GROUP BY statement is easy to use, but it can be hard to master. Learn what you can do with GROUP BY, how it works, and how to use it with aggregate functions. GROUP BY is one of the basic SQL tools. However, it might be difficult for beginners to understand how GROUP BY works and how it is used to compute statistics with different aggregate functions. In this article, I’ll use several examples to show exactly how GROUP BY organizes rows and how aggregate functions can be used to calculate statistics on one or more columns. So, let’s get started. How Does SQL GROUP BY Work? GROUP BY is an indispensable tool for any data analyst working with SQL. If you want to organize your data in groups and calculate some kind of aggregate statistics for these groups, the GROUP BY clause is what you need. So, what does GROUP BY do? Basically, it groups rows with the same value into one group or bucket. Let’s see how it works. Suppose we’re running a bookstore and want to know how many books of different genres we have in stock. Our database includes a table that lists the books’ titles, genres, and stock quantity. The visualization below shows how the GROUP BY clause creates groups from table data. We want to know the total quantity of books for each genre; thus, GROUP BY groups the books of the same genre and sums up the corresponding quantities. This creates a result table that lists genres and their total quantity of books in our stock. For a more detailed introduction to the GROUP BY topic, see this video tutorial from our “We Learn SQL” series. GROUP BY in Action Now it’s time for more specific examples of SQL queries with a GROUP BY clause. We’ll use the books table, which stores the ID, title, author, genre, language, price, and quantity of each novel we stock. idtitleauthorgenrelangpriceqty 1Les Trois MousquetairesAlexandre Dumasadventurefr11.904 2A Game of ThronesGeorge R.R. Martinfantasyen8.495 3Pride and PrejudiceJane Austenromanceen9.992 4Vampire AcademyRichelle Meadfantasyen7.993 5IvanhoeWalter Scottadventureen9.993 6ArmanceStendhalromancefr5.881 To calculate the total number of books of each genre, we’ll use the following query: SELECT genre, SUM(qty) AS total FROM books GROUP BY genre; Here, in the GROUP BY clause, we select our rows to be grouped by the column genre. Then the SUM(qty) function in the SELECT statement sums the qty values within each group (i.e. each book genre), with the result displayed in the corresponding total field: genretotal adventure7 fantasy8 romance3 Using Aggregate Functions with GROUP BY GROUP BY puts rows with the same value into one bucket. We usually want to compute some statistics for this group of rows, like the average value or the total quantity. To this end, SQL provides aggregate functions that combine values from a certain column into one value for the respective group. So far, we’ve only used SUM() as our aggregate function for grouping the book titles in stock. However, this is not the only aggregate function you can use with GROUP BY. SQL also offers: COUNT() to calculate the number of rows in each group. AVG() to find the average value for each group. MIN() to return the minimum value in each group. MAX() to return the maximum value in each group. Let’s see how the AVG() function works with GROUP BY. This time, we want to calculate the average price for books in each genre. We’ll start by visualizing the output we want to get. We’re again grouping our books by genre, but this time we want to calculate the average book price in each genre. The SQL query for this looks as follows: SELECT genre, AVG(price) AS avg_price FROM books GROUP BY genre; This query creates a table with two columns (genre and avg_price), where the average price is calculated by averaging the price values for the books of each genre: genreavg_price adventure10.945 fantasy8.24 romance7.935 Actually, we are not limited to using only one aggregate function with a GROUP BY clause. So, let’s add the information about the minimum and maximum price of the books in each genre: SELECT genre, MIN(price) AS min_price, AVG(price) AS avg_price, MAX(price) AS max_price FROM books GROUP BY genre; The result set now includes four columns: genre, min_price, avg_price, and max_price. genremin_priceavg_pricemax_price adventure9.9910.94511.90 fantasy7.998.248.49 romance5.887.9359.99 Note that when using GROUP BY the SELECT statement may only include fields that are either used in the aggregate function or listed in the GROUP BY clause. For example, in our case, we cannot add title or author to our result set. To do so makes no sense, as each row in our output table includes information on several books with different titles and authors. To learn about other problems you may encounter when using GROUP BY in SQL, check out this article that discusses common GROUP BY errors and how to avoid them. GROUP BY Two Columns In SQL, you can also group your data using several columns. For example, let’s say we want to group our books not only by genre but also by language. And we want to calculate the number of book titles in each category. Hint: The COUNT() aggregate function will assist us in calculating the number of book titles (i.e. rows). To compute the number of book titles by genre and language, we’ll use the following query: SELECT genre, lang, count(title) AS titles FROM books GROUP BY genre, lang; We’ll get an output table with three columns: genre, language (lang), and the number of titles in that category: genrelangtitles adventureen1 adventurefr1 fantasyen2 romanceen1 romancefr1 So, in our example dataset, we have: One English-language book title in the adventure genre (Ivanhoe). One French-language book title in the adventure genre (Les Trois Mousquetaires). Two English-language book titles in the fantasy genre (A Game of Thrones and Vampire Academy). One English-language book title in the romance genre (Pride and Prejudice). One French-language book title in the romance genre (Armance). As you see, there are no French-language fantasy books in our dataset. For more examples of GROUP BY applications, check out this article showing how GROUP BY can be leveraged in realistic business scenarios. Ready to Use GROUP BY in SQL? Now you have a better understanding of how GROUP BY can assist you with organizing and analyzing your data. However, if you want to handle GROUP BY like a pro, you need lots of practice. LearnSQL.com offers an SQL Basics course that includes 129 interactive exercises covering aggregation, grouping, JOINs, subqueries, and more. This course is a perfect opportunity to learn how to group rows and compute statistics with aggregate functions. Happy learning! Tags: sql learn sql GROUP BY ORDER BY You may also like Difference between GROUP BY and ORDER BY in Simple Words For someone who's learning SQL, one of the most common concepts that they get stuck with is the difference between GROUP BY and ORDER BY. Read more Using GROUP BY in SQL Master the powerful SQL GROUP BY command. Learn to compute statistical summaries with aggregate functions to solve real-world Business Intelligence challenges. Read more GROUP BY Clause: How Well Do You Know It? The GROUP BY clause is the most basic way to compute statistics in SQL. It can be quite tough for beginners but it is really powerful. Read more SQL Window Functions vs. GROUP BY: What’s the Difference? Window functions and GROUP BY may seem similar at first, but they’re quite different. Learn how window functions differ from GROUP BY and aggregate functions. Read more NULL Values and the GROUP BY Clause We've already covered how to use the GROUP BY clause but how does SQL's GROUP BY clause work when NULL values are involved? Find out! Read more Getting the Hang of the GROUP BY Clause GROUP BY is an important part of the SQL SELECT statement. But new SQL coders can run into some problems when this clause is used incorrectly. Here’s how to avoid those issues. Read more What Is the Difference Between a GROUP BY and a PARTITION BY? What is the difference between a GROUP BY and a PARTITION BY in SQL queries? When should you use which? You can find the answers in today's article. Read more Subscribe to our newsletter Join our weekly newsletter to be notified about the latest posts.