
As you start coding in SQL, you will use some statements and techniques over and over again. We call these “SQL patterns”. This series will look at the most common SQL patterns and consider how to use them.
Pivoting in SQL refers to taking the data in table rows and making that data into columns. This is very important in reporting, and it’s easy to do when you use the CASE statement. In fact, you will see that pivoting data is very similar to the CASE
summarization pattern, which you can read about here. The idea is the same, but pivoting has a special place in our hearts because of its usefulness.
(Note: In this article, we will look at pivoting data using CASE
rather than a PIVOT
clause. Some RDBMSs have a dedicated PIVOT
command that functions like the method we present in this post. PIVOT
command is only available in recent releases of the enterprise level RDBMSs, when the PIVOT
is missing you can use this pattern. For more information on the PIVOT
command, check this out.)
Let’s revisit the sample table we used for the CASE
summarization pattern. It contains transactional data from a fictional small company.
The transactions
table has these columns:
id
– A unique identifier for each transactiondatetime
– The timestamp for the transactioncustomer
– The customer’s IDcreditcard
– The ID of the type of credit card usedamount
– The transaction amount, in dollarsaccount
– The customer’s account numbertype
– The transaction type
We use this query to get some sample data:
SELECT id, datetime, customer, creditcard, amount, account, type FROM transactions WHERE datetime BETWEEN TIMESTAMP '2017-01-01' AND '2017-05-01' ORDER BY customer;
And the data is:
id datetime customer creditcard amount account type 31 2017-05-01 00:00:00.000000 1 1 210 1 type_2 28 2017-02-01 00:00:00.000000 1 1 50 1 type_2 29 2017-03-01 00:00:00.000000 1 1 150 1 type_1 30 2017-04-01 00:00:00.000000 1 1 200 1 type_1 27 2017-01-01 00:00:00.000000 1 1 100 1 type_1 32 2017-01-01 00:00:00.000001 2 1 100 1 type_1 34 2017-03-01 00:00:00.000000 2 1 350 1 type_1 33 2017-02-01 00:00:00.000000 2 1 10 1 type_2 36 2017-05-01 00:00:00.000000 2 1 10 1 type_1 35 2017-04-01 00:00:00.000000 2 1 600 1 type_3 38 2017-04-01 00:00:00.000000 3 1 600 1 type_3 39 2017-05-01 00:00:00.000000 3 1 10 1 type_3 37 2017-01-01 00:00:00.000000 3 1 350 1 type_3
Understanding the Pivot SQL Pattern
Let’s now look at an example based on our transactional table data. We want to see the sum of all transactions from all customers divided by transactional types. As we have transactional type data in the type
column, we will need to change the rows to columns. . In this way, every new column will represent one data type.
The query is:
SELECT customer, SUM( CASE WHEN type = 'type_1' THEN amount ELSE 0 END ) AS sum_type_1, SUM( CASE WHEN type = 'type_2' THEN amount ELSE 0 END ) AS sum_type_2, SUM( CASE WHEN type = 'type_3' THEN amount ELSE 0 END ) AS sum_type_3 FROM transactions WHERE datetime BETWEEN TIMESTAMP '2017-01-01' AND '2017-05-01' GROUP BY customer ORDER BY customer ;
A look at the code reveals that we are using the CASE
pattern: when the value in the type column (type_1
) is the same as the value in the sum_type
column (sum_type_1
) the column’s value is added to the relevant sum_type
column. When it is not, we put a “0” in the column.
And the result set looks like this:
customer sum_type_1 sum_type_2 sum_type_3 1 450 260 0 2 460 10 600 3 0 0 960
Try It Yourself
In this case, we used transaction amounts as our pivot point. A similar query could count the number of particular types of transactions. You have the table definition in Vertabelo and the query illustrated in this post.
Try creating the counting query for yourself!
Did you miss our first post about SQL patterns? Read it here.