How to Group by Year in T-SQL

Database:

SQL Server

Operators:

GROUP BY, YEAR, OVER, PARTITION BY, SUM

Problem:

You want to group your data by year.

Example I:

One of the columns in your data is transaction_date. It contains a date. You would like to group all your data by year and calculate the total money earned each year.

The data table looks like this:

transaction_date  money
2018-03-25        1700
2019-09-12        100
2018-07-14        1200
2018-01-05        400
2019-06-08        2000
2020-03-06        1500
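
To try the queries in this example, you can build the sample table yourself. The script below is only a sketch: the column types (date and int) are assumptions, since the article gives just the column names and values.

```sql
-- Assumed schema: the column types are not stated in the article.
CREATE TABLE data (
  transaction_date date NOT NULL,
  money            int  NOT NULL
);

-- Sample rows copied from the table above.
INSERT INTO data (transaction_date, money)
VALUES
  ('2018-03-25', 1700),
  ('2019-09-12', 100),
  ('2018-07-14', 1200),
  ('2018-01-05', 400),
  ('2019-06-08', 2000),
  ('2020-03-06', 1500);
```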

Solution 1 (displaying the year and the money earned):

SELECT
  YEAR(transaction_date) AS year,
  SUM(money) AS money_earned
FROM data
GROUP BY YEAR(transaction_date);

The result is:

year  money_earned
2020  1500
2019  2100
2018  3300

Solution 2 (displaying the complete date, the year, and the money earned in the corresponding year):

SELECT
  transaction_date,
  YEAR(transaction_date) AS year,
  SUM(money) OVER(PARTITION BY YEAR(transaction_date)) AS money_earned
FROM data;

The result is:

transaction_date  year  money_earned
2018-03-25        2018  3300
2018-07-14        2018  3300
2018-01-05        2018  3300
2019-09-12        2019  2100
2019-06-08        2019  2100
2020-03-06        2020  1500

Discussion:

In this example it's assumed that you don't have the year column. You have the column with complete dates and would like to retrieve the year from it.

To retrieve a year from the date in SQL Server, you can use the YEAR() function. The argument of this function should be a date – here, the transaction_date column.

If you'd like to display the year and the total money earned in this year, you can use a GROUP BY. The first selected column is the year extracted from the date. The second column is the aggregate function SUM(money). At the end of the query you need a GROUP BY YEAR(transaction_date). Hence Solution 1.
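
Without an ORDER BY, SQL Server does not guarantee the order of the returned rows. As a sketch, Solution 1 can be extended to list the most recent year first. Unlike GROUP BY, the ORDER BY clause may refer to the year alias, because it is evaluated after the SELECT list.

```sql
-- Same grouping as Solution 1, with an explicit sort order.
SELECT
  YEAR(transaction_date) AS year,
  SUM(money) AS money_earned
FROM data
GROUP BY YEAR(transaction_date)  -- the alias cannot be used here
ORDER BY year DESC;              -- the alias can be used here
```

With the sample data, this returns 2020, 2019, and 2018, in that order.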

If you'd like to display more columns, use a window function (Solution 2). After SUM(money), write the OVER() clause and, since you want one total per year, put PARTITION BY YEAR(transaction_date) inside it. Note that the year alias defined in the SELECT list is not yet available when the sum is computed, so PARTITION BY year won't work here; you have to repeat the YEAR(transaction_date) expression.
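
If you would rather not repeat the YEAR(transaction_date) expression, one possible alternative is to compute the year in a common table expression first, so that it exists as a real column by the time the window function runs. This is a sketch, not part of the original solutions; data_with_year is a hypothetical name chosen for illustration.

```sql
-- data_with_year is a hypothetical CTE name.
WITH data_with_year AS (
  SELECT
    transaction_date,
    money,
    YEAR(transaction_date) AS year
  FROM data
)
SELECT
  transaction_date,
  year,
  -- year is now a column of the CTE, so it can be used in PARTITION BY
  SUM(money) OVER(PARTITION BY year) AS money_earned
FROM data_with_year;
```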

Example II:

One of the columns in your data is year. You would like to group all your data by this column and calculate the total money earned each year.

The data table looks like this:

year  month  day  money
2018  3      25   1700
2019  9      12   100
2018  7      14   1200
2018  1      5    400
2019  6      8    2000
2020  3      6    1500

Solution 1 (displaying the year and the money earned):

SELECT
  year,
  SUM(money) AS money_earned
FROM data
GROUP BY year;

The result is:

year  money_earned
2020  1500
2018  3300
2019  2100

Solution 2 (displaying year, month, day and the money earned in the corresponding year):

SELECT
  year,
  month,
  day,
  SUM(money) OVER(PARTITION BY year) AS money_earned
FROM data;

The result is:

year  month  day  money_earned
2018  3      25   3300
2018  7      14   3300
2018  1      5    3300
2019  9      12   2100
2019  6      8    2100
2020  3      6    1500

Discussion:

In this example it's assumed that you already have the year column.

If you'd like to display only the year and the total money earned in that year, a simple GROUP BY is enough. You use the aggregate function (here, SUM) with the column you want to total, and at the end of the query you group by year. You can rename the resulting column using the AS keyword followed by a new name. Also, if you'd like the result sorted by year, add ORDER BY year at the end of your query; without it, the row order is not guaranteed.
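
Put together, the renaming and sorting described above might look like the following sketch. The alias total_per_year is a hypothetical name used only for illustration.

```sql
-- total_per_year is a hypothetical alias; any valid name works.
SELECT
  year,
  SUM(money) AS total_per_year
FROM data
GROUP BY year
ORDER BY year;  -- ascending: 2018, 2019, 2020 for the sample data
```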

It's more complicated if you'd also like to display some other columns. Then you need the solution using a window function (Solution 2). You should use the aggregate function with the appropriate column (SUM(money)) and write the OVER() clause afterwards. In this clause, you should use PARTITION BY with the column by which you'd like to group. That's how you get:

SUM(money) OVER(PARTITION BY year)

In this solution, you don't use a GROUP BY clause.
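
Because no rows are collapsed, each row's own columns stay available next to the yearly total. As a sketch, the query below is a variation on Solution 2 that also selects the money column and adds an ORDER BY so that the output order is predictable.

```sql
-- Variation on Solution 2: each row keeps its own amount
-- and also shows the total for its year.
SELECT
  year,
  month,
  day,
  money,
  SUM(money) OVER(PARTITION BY year) AS money_earned
FROM data
ORDER BY year, month, day;
```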

