Back to articles list April 8, 2020 - 6 minutes read SQL Window Functions vs. GROUP BY: What’s the Difference? Ignacio L. Bisso Ignacio is a database consultant from Buenos Aires, Argentina. He’s worked for 15 years as a database consultant for IT companies like Informix and IBM. These days, he teaches databases at Sarmiento University and works as a PostgreSQL independent SQL consultant. A proud father of four kids with 54 years in his backpack, Ignacio plays soccer every Saturday afternoon, enjoying every match as if it’s his last one. Tags: sql learn sql window functions A very common misconception among SQL users is that there is not much difference between SQL window functions and aggregate functions or the GROUP BY clause. However, the differences are very significant. Perhaps the only similar point between GROUP BY and window functions is that both allow you to execute a function (such as AVG, MAX, MIN, or COUNT) on a group of records. I’d say that window functions’ special power is that they allow us to obtain results that otherwise would be almost impossible to achieve. In this article, we’ll go over the use of window functions vs GROUP BY and window functions vs aggregate functions. Quick Review of GROUP BY The GROUP BY clause allows us to group a set of records based on some criteria and apply a function (e.g. AVG or MAX) to each group, obtaining one result for each group of records. Let’s see an example. We have a table called employee with a total of five employees and three departments: Employee_NameDepartmentSalary John RobertsFinance2300 Peter HudsonMarketing1800 Sue GibsonFinance2000 Melinda BishopMarketing1500 Nancy HudsonIT1950 fig1: the employee table Suppose we want to obtain the average salary per department and the top salary for every department. We should use the following query: SELECT Department, avg(salary) as average, max(salary) as top_salary FROM employee GROUP BY department The image below shows the result: Departmentaveragetop_salary Marketing16501800 Finance21502300 IT19501950 GROUP BY vs Window Functions When comparing window functions and GROUP BY, it’s essential to remember that GROUP BY collapses the individual records into groups; after using GROUP BY, you cannot refer to any individual field because it is collapsed. Later, we will talk in depth about this topic. For now, we’ll just mention that window functions do not collapse individual records. So, if you want to create a report with an employee name, salary, and the top salary of the employee’s department, you can’t do it with GROUP BY. The individual records of each employee are collapsed by the GROUP BY department clause. For this kind of report, you need to use window functions, which is the topic of the next section. If you want to go deep into the nuances of SQL GROUP BY and reports, we recommend our interactive course Creating Basic SQL Reports. The Power of Window Functions Window functions are a powerful feature of SQL. They allow us to apply functions like AVG, COUNT, MAX, and MIN on a group of records while still leaving the individual records accessible. Since the individual records are not collapsed, we can create queries showing data from the individual record together with the result of the window function. This is what makes window functions so powerful. Suppose we want to obtain a list of employee names, salaries, and the top salary in their departments. SELECT employee_name, department, salary, max(salary) OVER (PARTITION BY department) as top_salary FROM employee The next image shows the result: Employee_NameDepartmentsalarytop_salary John RobertsFinance23002300 Peter HudsonMarketing18001800 Sue GibsonFinance20002300 Melinda BishopMarketing15001800 Nancy HudsonIT19501950 In the previous query, we used a window function: max(salary) OVER (PARTITION BY department) as top_salary The window function is MAX() and we applied it to the set of records defined by the clause OVER (PARTITION BY department), which are the records with the same value in the department field. Lastly, we renamed the column top_salary. In the query result, we have rows for individual employees. If we used GROUP BY instead of window functions, we would have rows for each department. Window functions have a rather verbose syntax; if you want to go into details, I suggest the “Window Functions” course, which is a step-by-step tutorial that takes you through SQL window functions using examples and exercises. Window Functions vs Aggregate Functions When we compare window functions and aggregate functions, we note a super powerful feature on the window functions side: positional functions. They allow us to obtain a column value from other records in the same window. This is a really amazing ability, allowing SQL users to create complex reports in just a few lines. Let’s briefly discuss two of these functions: lead() and lag(). The lead() function returns the column value of the previous record in the window, while lag() returns the column value from the next record in the window. It is very important to have the window ordered by the right column if you want to use these functions. Let’s see an example of how we can use these functions. Suppose we have a table that stores company shares with their market values at a point in time. The table might look like this: share_symboltimestampvalue OILBEST2020-03-05 10:00120 OILBEST2020-03-05 12:00123 OILBEST2020-03-05 15:00122 BANKWEB2020-03-05 10:0091 BANKWEB2020-03-05 12:0087 BANKWEB2020-03-05 15:0099 fig2: the share table Suppose we want a report showing every share’s value with its previous value and the variation percentage related to the previous value. We can do it by using the lead() function to obtain the share’s previous value. Note that we use ORDER BY timestamp when defining the partition (i.e. the window of records). We’ll go back to this point later. SELECT share_symbol, timestamp, value, lag(value) OVER (PARTITION BY share_symbol ORDER BY timestamp ) AS previous_value, trunc(((value - (lag(value) OVER (PARTITION BY share_symbol ORDER BY timestamp )))*100)/value,2) AS percentage_variation FROM share Notice that the columns previous_value and percentage_variation are calculated columns. They use values from different records in the same table. share_symboltimestampvalueprevious_valuepercentage_variation OILBEST2020-03-05 10:00120 OILBEST2020-03-05 12:001231202.43 OILBEST2020-03-05 15:00122123-0.81 BANKWEB2020-03-05 10:0091 BANKWEB2020-03-05 12:008791-4.59 BANKWEB2020-03-05 15:00998712.12 When we use positional functions, it is very important to put an ORDER BY clause along with the PARTITION clause (as we did in the previous query). If we don’t use the right ORDER BY, the results can be wrong. Why? Because positional functions work based on the order of the records in the window. Let’s examine this a little more. The FIRST_VALUE() function returns a column value from the first record in the window. LAG(), as we know, returns the column value of the previous record in the window. Having the right window order is crucial; imagine what you’d get from these functions otherwise! In our example, we want the previous chronological market value for a specific share. Thus, we used ORDER BY timestamp. If we omit the ORDER BY or we order by another column, the result would be wrong. In some specific cases, positional functions can return wrong values because of a partially populated window. And there are more window functions, like RANK(), NTH_VALUE() and LAST_VALUE(). We don’t have space to cover all this here, but I suggest checking out this article explaining window functions and these window function examples to learn more. Closing the Window In this article, we explored the differences between window functions and GROUP BY. We looked at examples with several aggregate and window functions. We also talked about an important limitation of the GROUP BY clause, namely the “collapse of records”. This limitation is not present on window functions, allowing SQL developers to combine record-level data with window function results in the same query. Another advantage of window functions is their capability to combine query values from different records (from the same window) in the same row of the result set. If you are interested in learning more about window functions, I suggest the Window Functions course, where you can learn SQL window functions using interactive exercises and detailed explanations. Tags: sql learn sql window functions You may also like SQL Window Functions Cheat Sheet This 2-page SQL Window Functions Cheat Sheet covers the syntax of window functions and a list of window functions. Download it in PDF or PNG format. Read more What Is the Difference Between a GROUP BY and a PARTITION BY? What is the difference between a GROUP BY and a PARTITION BY in SQL queries? When should you use which? You can find the answers in today's article. Read more SQL Window Functions vs. GROUP BY: What’s the Difference? Window functions and GROUP BY may seem similar at first, but they’re quite different. Learn how window functions differ from GROUP BY and aggregate functions. Read more HAVING vs. WHERE in SQL: What You Should Know To take advantage of SQL’s great power, you must understand HAVING vs. WHERE clauses. How do you use them? What are their differences? Read more Difference between GROUP BY and ORDER BY in Simple Words For someone who's learning SQL, one of the most common concepts that they get stuck with is the difference between GROUP BY and ORDER BY. Read more You Want to Learn SQL? You've Come to the Right Place! If you want to learn SQL basics or enhance your SQL skills, check out LearnSQL.com for a wide range of SQL courses and tracks. Read more SQL Window Function Example With Explanations Interested in how SQL window functions work? Scroll down to see our SQL window function example with definitive explanations! Read more SQL Course of the Month – Window Functions Find out why you should learn SQL window functions in April and why you should do it in our course. Read more Common SQL Window Functions: Using Partitions With Ranking Functions Once you’ve learned such window functions as RANK or NTILE, it’s time to master using SQL partitions with ranking functions. Read more How to Use Rank Functions in SQL In this article, you’ll learn how to use rank functions in SQL. It’ll give you a solid foundation for getting deeper into SQL window functions. Read more Subscribe to our newsletter Join our weekly newsletter to be notified about the latest posts.