Back to articles list Articles Cookbook
6 minutes read

Grouping, Rolling, and Cubing Data

The first two articles in this series highlighted SQL Server's ability to organize data into user-defined windows and its aggregate functions. Part 3 will focus on other methods of aggregating and organizing data using built-in SQL Server features – specifically, its grouping functions.

For consistency, the same base data will be used as in the first two parts of this series.

CREATE TABLE WindowTable (WindowID INT IDENTITY, House Varchar(32), FullName Varchar(64), 
PhysicalSkill Decimal(4, 2), MentalSkill Decimal (4, 2))
GO

WindowTable table in Vertabelo

INSERT WindowTable
(House, FullName, PhysicalSkill, MentalSkill)
VALUES ('Stark', 'Robb Stark', 6, 6), ('Stark', 'Ned Stark', 8, 7), 
('Stark', 'Bran Stark', 2, 9), ('Stark', 'Arya Stark', 4, 6),
('Lannister', 'Jamie Lannister', 7, 5), ('Lannister', 'Tyrion Lannister', 3, 10),
('Lannister', 'Tywin Lannister', 3, 8)
GO

SQL Server's Grouping Functions

The GROUP BY clause is the basis for SQL Server's grouping functions: ROLLUP, CUBE, and GROUPING SETS. At its simplest, GROUP BY summarizes rows based on user-expressed conditions:

Select SUM(PhysicalSkill) as PhysicalSkillTotal,
 SUM(MentalSkill) as MentalSkillTotal,
 House from WindowTable
Group By House

SQL Server's Grouping Functions

In the above example, the SUM of PhysicalSkill and MentalSkill are calculated. The calculations are grouped by the House field.

Any fields not included in the GROUP BY clause must be contained within an aggregate function in the SELECT list; otherwise, an error will be generated. You can see this in the example below:

Select PhysicalSkill, 
 SUM(MentalSkill),
 House from WindowTable
Group By House

GROUP BY clause error message

To demonstrate the grouping functions, another column is needed in our sample set. We'll add it now:

ALTER TABLE WindowTable ADD Gender Char(1)
GO

Update WindowTable Set Gender = 'M'
Update WindowTable Set Gender = 'F' Where FullName = 'Arya Stark'
GO

With all this in place, we can start with our first grouping function, ROLLUP.

ROLLUP

ROLLUP generates a summary row for the fields described in the GROUP BY clause in a hierarchical fashion:

Select SUM(PhysicalSkill) as PhysicalSkillTotal,
SUM(MentalSkill) as MentalSkillTotal,
House, Gender from WindowTable
Group By House, Gender WITH ROLLUP

ROLLUP

In the above example, a summary row is generated for each combination of House and Gender.

The Lannister House data, in the red box, has no records where GENDER = F. So the summary row (the second row, where Gender = NULL) contains summarized data for the Lannister males only.

There are two detail rows for the Stark grouping, shown in the blue box. Since there's one record where GENDER = F in the Stark dataset, there's one more row here than in the Lannister dataset. The summary row (where Gender = NULL) lists the combined values for the Stark men and women.

Values from both houses and genders are rolled up and presented in the green box.

Cleaning Data Presentation with GROUPING

This is a good place to temporarily interrupt our consideration of ROLLUP and discuss a handy SQL Server function that can clean up data presentations: GROUPING(). GROUPING returns a "1" when an expression is aggregated within the query, and a "0" when it is not:

Select SUM(PhysicalSkill) as PhysicalSkillTotal,
 SUM(MentalSkill) as MentalSkillTotal,
CASE WHEN (GROUPING(House) = 1) THEN 'All Houses' Else House END as House,
CASE WHEN (GROUPING(Gender) = 1) THEN 'M and F' Else Gender END as Gender 
From WindowTable
Group By House, Gender WITH ROLLUP

Cleaning Data Presentation with GROUPING

By referring to the Gender and House fields using GROUPING, it's possible to display values other than NULL where an aggregation has been performed.

GROUPING should be used when creating final reports. For the remaining examples in this article, it will be excluded to make the groupings clearer.

Cleaning Data Presentation with GROUPING

Now, back to ROLLUP.

For each additional column included in ROLLUP, another summary row will be added in the same hierarchical fashion. To demonstrate this, another aggregatable column is added:

ALTER TABLE WindowTable ADD Class Varchar(32)
GO

Update WindowTable Set Class = 'Warrior' Where FullName IN ('Robb Stark', 'Ned Stark', 'Jamie Lannister')
Update WindowTable Set Class = 'NonWarrior' Where Class IS NULL
GO

Select SUM(PhysicalSkill) as PhysicalSkillTotal,
 SUM(MentalSkill) as MentalSkillTotal,
House, Gender, Class from WindowTable
Group By House, Gender, Class WITH ROLLUP
GO

ROLLUP summary row added for additional column included in rollup

Class, at the bottom of the hierarchy, is summarized first. Gender is next, and the House summary is last. A single summary row is at the bottom.

Our next grouping function, CUBE, differs greatly. A summary row is created for each permutation of values. As additional grouping columns are added, this rapidly increases the number of summary rows.

Cube

In addition to the summary rows generated by ROLLUP, CUBE generates cross-tabulation rows for each possible combination, as shown in the red boxes below:

Select SUM(PhysicalSkill) as PhysicalSkillTotal,
 SUM(MentalSkill) as MentalSkillTotal,
House, Gender
From WindowTable
Group By House, Gender WITH Cube

CUBE

The F and M summary rows are created in addition to the type of summary rows generated by ROLLUP.

GROUPING SETS

GROUPING SETS() is the most verbose and flexible aggregation method. Both CUBE and ROLLUP are conveniently expressed versions of GROUPING SETS.

GROUPING SETS allows the user to explicitly specify multiple levels of grouping and summarization.

In the following progressive examples, it becomes clear how GROUPING SETS can generate summarizations that were previously only possible using multiple SQL statements with UNION operators. The final () in each grouping sets indicates the generation of a totaled row.

Group by Gender only. Without the final (), this is the equivalent of a basic GROUP BY.

Select SUM(PhysicalSkill) as PhysicalSkillTotal,
 SUM(MentalSkill) as MentalSkillTotal,
 CASE WHEN (GROUPING(Gender) = 1) THEN 'M and F' Else Gender END as Gender
 from WindowTable
Group By Grouping Sets( (Gender))

GROUPING SETS()

Example 1: Group by Gender only. The summary row is included.

Select SUM(PhysicalSkill) as PhysicalSkillTotal,
 SUM(MentalSkill) as MentalSkillTotal,
 CASE WHEN (GROUPING(Gender) = 1) THEN 'M and F' Else Gender END as Gender
 from WindowTable
Group By Grouping Sets( (Gender), ())

Group by Gender only.  The summary row is included.

Example 2: Group by GENDER and HOUSE. This is similar to GROUP BY, but with the summary row included:

Select SUM(PhysicalSkill) as PhysicalSkillTotal,
 SUM(MentalSkill) as MentalSkillTotal,
 CASE WHEN (GROUPING(Gender) = 1) THEN 'M and F' Else Gender END as Gender,
 CASE WHEN (GROUPING(House) = 1) THEN 'All Houses' Else House END as House
 from WindowTable
Group By Grouping Sets( (Gender, House), ())

Group by GENDER and HOUSE. Similar to GROUP BY, but with the summary row included

Example 3: Group by GENDER, and also group separately by HOUSE. This is similar to doing two separate GROUP BYs and using a UNION operator to combine the sets:

Select SUM(PhysicalSkill) as PhysicalSkillTotal,
 SUM(MentalSkill) as MentalSkillTotal,
 CASE WHEN (GROUPING(Gender) = 1) THEN 'M and F' Else Gender END as Gender,
 CASE WHEN (GROUPING(House) = 1) THEN 'All Houses' Else House END as House
 from WindowTable
Group By Grouping Sets( (Gender), (House), ())

Group by GENDER, and also group separately by HOUSE.

Example 4: Group hierarchically by HOUSE, GENDER, and then HOUSE. This is the equivalent of specifying ROLLUP.

Select SUM(PhysicalSkill) as PhysicalSkillTotal,
 SUM(MentalSkill) as MentalSkillTotal,
 CASE WHEN (GROUPING(House) = 1) THEN 'All Houses' Else House END as House,
 CASE WHEN (GROUPING(Gender) = 1) THEN 'M and F' Else Gender END as Gender
 from WindowTable
Group By Grouping Sets( (House, Gender), (House), ())

Group hierarchically by HOUSE, GENDER, and then HOUSE.

Example 5: Group by all possible combinations. This is the equivalent of CUBE.

Select SUM(PhysicalSkill) as PhysicalSkillTotal,
 SUM(MentalSkill) as MentalSkillTotal,
 CASE WHEN (GROUPING(House) = 1) THEN 'All Houses' Else House END as House,
 CASE WHEN (GROUPING(Gender) = 1) THEN 'M and F' Else Gender END as Gender
 from WindowTable
Group By Grouping Sets( (House, Gender), (House), (Gender), ())

by all possible combinations.  Equivalent of CUBE.

GROUPING SETS is the most powerful of the functions we've discussed, but it lacks the conciseness and readability of ROLLUP and CUBE.

Each of SQL Server's grouping functions provides excellent functionality, and each can be used in reporting processes to easily summarize data with minimal impact on performance.