Back to articles list August 9, 2016 - 5 minutes read Window Functions: Part Two: Aggregating Data Jeffrey J. Keller Tags: SQL SQL Server window functions In part one of this series on SQL Server window functions, the functionality of the OVER() clause was discussed. This article will focus on aggregate functions. The majority of these can be used in conjunction with OVER. Some, like SUM() and AVG(), are widely used. Others — including VARP() and CHECKSUM_AGG() — are not as well-known, but are potentially quite useful. OVER defines the user-specified range on which a function is applied. This can be combined with many SQL functions, allowing easy separation of data into defined windows within a single SQL query. In the examples below, we will use the same base data as in the first article. The attributes PhysicalSkill and MentalSkill will be aggregated, often based on the House group. CREATE TABLE WindowTable ( WindowID INT IDENTITY, House Varchar(32), FullName Varchar(64), PhysicalSkill Decimal(4, 2), MentalSkill Decimal (4, 2) ) GO INSERT WindowTable (House, FullName, PhysicalSkill, MentalSkill) VALUES ('Stark', 'Robb Stark', 6, 6), ('Stark', 'Ned Stark', 8, 7), ('Stark', 'Bran Stark', 2, 9), ('Stark', 'Arya Stark', 4, 6), ('Lannister', 'Jamie Lannister', 7, 5), ('Lannister', 'Tyrion Lannister', 3, 10), ('Lannister', 'Tywin Lannister', 3, 8) GO The Commoners: SUM, MIN, MAX, COUNT, and AVG The aggregate functions SUM(), MIN(), MAX(), COUNT(), and AVG() are the most commonly seen in SQL. Let's look at each one in some detail. SUM The most basic of aggregate functions, SUM operates only on numeric data. NULL values are automatically ignored. In Part 1 of this series, SUM was used in most examples due to its simplicity and intuitive nature. In this example, SUM is partitioned by house. It will add up totals only within the window defined by that partition: SELECT House, FullName, PhysicalSkill, SUM(PhysicalSkill) OVER (PARTITION BY House) AS TotalPhysicalSkillByHouse, MentalSkill, SUM(MentalSkill) OVER (PARTITION BY House) AS TotalMentalSkillByHouse FROM WindowTable A lesser-known functionality of SUM is that it takes an additional argument: either ALL or DISTINCT. If this argument is not specified, ALL is used by default. DISTINCT will cause the function to include distinct values only. So if a number is found twice in the dataset, it will only be included in the aggregation once. SELECT House, FullName, PhysicalSkill, MentalSkill FROM WindowTable SELECT House, SUM( ALL PhysicalSkill) AS TotalPhysicalSkillByHouse , SUM( DISTINCT PhysicalSkill) AS TotalPhysicalSkillByHouseDistinct, SUM( ALL MentalSkill) AS TotalMentalSkillByHouse, SUM( DISTINCT MentalSkill) AS TotalMentalSkillByHouseDistinct FROM WindowTable GROUP BY House In the above example, PhysicalSkill contains '3' twice, and MentalSkill contains '6' twice. when the DISTINCT argument is used with SUM, only one "3" and one "6" is included. MIN and MAX Interestingly, MIN and MAX also take the DISTINCT and ALL arguments, although they serve no purpose in the following context: SELECT House, FullName, PhysicalSkill, MentalSkill FROM WindowTable SELECT House, MIN( ALL PhysicalSkill) AS MinPhysicalSkillByHouse , MIN( DISTINCT PhysicalSkill) AS MinPhysicalSkillByHouseDistinct, MAX( ALL MentalSkill) AS MaxMentalSkillByHouse, MAX( DISTINCT MentalSkill) AS MaxMentalSkillByHouseDistinct FROM WindowTable GROUP BY House AVG, COUNT, and COUNT_BIG All three of these functions also take the DISTINCT and ALL arguments. In this slightly more complicated example, ROWS UNBOUNDED PRECEDING is used to generate a running total for each of the three functions: SELECT House, FullName, MentalSkill, AVG(MentalSkill) OVER (PARTITION BY House ORDER BY House ROWS UNBOUNDED PRECEDING) AS MentalSkillRunningAverage, COUNT(MentalSkill) OVER (PARTITION BY House ORDER BY House ROWS UNBOUNDED PRECEDING) AS MentalSkillCountRunningTotal, COUNT_BIG(MentalSkill) OVER (PARTITION BY House ORDER BY House ROWS UNBOUNDED PRECEDING) AS MentalSkillCountBigRunningTotal FROM WindowTable COUNT_BIG differs from COUNT only in the datatype returned (INT vs. BIGINT). The Uncommon Functions: VAR, VARP, STDEV, STDEVP, and CHECKSUM_AGG In SQL Server, VAR and VARP calculate the variance of a dataset. VAR is a common mathematical function that measures the variance of a sample data set. VARP measures the variance against a population. This function is used when you're working with the entire data set, rather than just a sample set. The internal calculation is slightly different. STDEV calculates the standard deviation. The standard deviation and variance are tightly connected – STDEV is just the square root of the variance. Like VARP, STDEVP calculates the standard deviation against a population. For more information on the math involved, see this link. Since we know the entire population in our example, VARP and STDEVP can be used. With OVER, it's easy to see the average and standard deviation for each individual. We can also verify that the standard deviation is the square root of the variance using another SQL function, SQRT. SELECT House, FullName, PhysicalSkill, AVG ( ALL PhysicalSkill ) OVER (PARTITION BY house) AS APartition ByVGPhysicalSkillByHouse, VARP ( ALL PhysicalSkill) OVER (PARTITION BY house) AS VarPPhysicalSkillByHouse, STDEVP( ALL PhysicalSkill) OVER (PARTITION BY house) AS StDevPPhysicalSkillByHouse , SQRT (VARP ( ALL PhysicalSkill) OVER (PARTITION BY house) ) AS SquareRootOfVariance FROM WindowTable In SQL Server, CHECKSUM_AGG returns the checksum of all values in a group, ignoring NULLs. This is a simplified way to determine if the data in a row or group of rows has changed: SELECT House, FullName, PhysicalSkill, CHECKSUM_AGG ( ALL (CAST (PhysicalSkill AS INT)) ) OVER (PARTITION BY house) AS CheckSumAGG FROM WindowTable UPDATE WindowTable SET PhysicalSkill = 7 WHERE FullName = 'Arya Stark' GO SELECT House, FullName, PhysicalSkill, CHECKSUM_AGG ( ALL (CAST (PhysicalSkill AS INT)) ) OVER (PARTITION BY house) AS CheckSumAGG FROM WindowTable CHECKSUM_AGG is only compatible with the integer data type. HASHBYTES is a more powerful and flexible method for detecting changes in data, but it is not an aggregate function and cannot be used with OVER. Hashbytes requires the concatenation of the compared fields first. Aggregate functions have been available since the original versions of SQL Server, and more were added in subsequent releases. The inclusion of OVER in 2008 provided a new level of functionality and usability. In the next part of this series, we'll take a closer look at ranking functions. Tags: SQL SQL Server window functions You may also like SQL Window Functions Cheat Sheet This 2-page SQL Window Functions Cheat Sheet covers the syntax of window functions and a list of window functions. Download it in PDF or PNG format. Read more SQL Window Function Example With Explanations Interested in how SQL window functions work? Scroll down to see our SQL window function example with definitive explanations! Read more SQL Window Functions vs. GROUP BY: What’s the Difference? Window functions and GROUP BY may seem similar at first, but they’re quite different. Learn how window functions differ from GROUP BY and aggregate functions. Read more Common SQL Window Functions: Using Partitions With Ranking Functions Once you’ve learned such window functions as RANK or NTILE, it’s time to master using SQL partitions with ranking functions. Read more How to Practice SQL Window Functions at Home Find out how to approach and structure your SQL window function practice time. Learn about the obstacles you’ll face and how to avoid them. Read more SQL Window Functions By Explanation Window functions in SQL operate on a set of table rows and return a single aggregated value for each of the rows. Read more Window Functions in SQL Server: Part One: The OVER() Clause Window functions were first introduced in standard SQL 2003 and then extended in SQL 2008. Since SQL Server 2005, Microsoft has been steadily improving window functions in the database engine. These functions perform an aggregate operation against a user-defined range of rows (the window) and return a detail-level value for each row. Read more Subscribe to our newsletter Join our weekly newsletter to be notified about the latest posts.