SQL Project for Beginners: AdventureWorks Sales Dashboard
Table of Contents
- Exploring the AdventureWorks Database
- Scoping Your SQL Project
- What Will We Include in Our SQL Project Dashboard?
- Create Your Own SQL Project for Beginners!
Building an SQL project on your own is a great way to sharpen your skills and gain practical experience. This SQL project example for beginners will show you how to create a sales dashboard with SQL and the AdventureWorks sample database.
If you're an SQL beginner who wants to take their skills beyond simple query exercises, it’s a good idea to tackle an SQL data analysis project. Not only will it challenge you to apply what you've learned, it will also give you a deeper understanding of how SQL works in real-world scenarios. This is your chance to move from basic practice to creating something tangible that showcases your growing expertise!
In this article, we’ll walk you through the steps of building a SQL project using the AdventureWorks database. We’ll cover everything from the project idea to writing the final queries. You'll learn how to approach each step of the process, and the concepts that we’ll use can be transferred to any SQL project.
Once you are finished, take a look at our SQL Databases for Practice course, which has 6 different databases for you to practice creating reports with. Alternatively, you can read up on how to find free datasets for your own SQL project and follow along with your own database!
Now, let’s take a look at the database we will be using in this SQL example project.
Exploring the AdventureWorks Database
The AdventureWorks database is a sample database created by Microsoft for SQL Server. It has been ported to many databases – including PostgreSQL, which we’ll be using in this article. The database includes 68 tables that describe a fictional bicycle manufacturer and contains data about different types of transactions occurring during business operations. Because AdventureWorks is so big, it’s a perfect dataset for beginners who want to practice SQL in a real-world setting.
The database is divided into 5 schemas, each representing a different field of operations: Production
, Purchasing
, Sales
, HR
, and Person
. In this project, most of our data will come from the Sales
schema. We will also use other schemas to get some additional information.
Let’s look at the tables we’ll use the most:
SalesOrderHeader
: This is the biggest table in the database. It stores all information relating to an order as a whole. It will be the most common starting point for sales-related queries.Product
: This stores extensive information about products offered by the company.SalesOrderDetail
: This table connects theSalesOrderHeader
andProduct
tables, storing information about the individual products that make up each order.ProductReview
: This stores customers’ reviews for specific products.Store
: This table stores basic information about each store. Most data is stored in the Demographics column in XML format; we will not be using it in this project.SalesTerritory
and CountryRegion: We’ll use these two tables together to get the name of the country with which the order is associated.
There are many other tables in the database, but don’t worry; we will be introducing them and their important columns as needed.
Scoping Your SQL Project
In this project, we want to build a sales dashboard for AdventureWorks. But how do we begin a project like this? A great starting point is to determine the scope of the project. Often, this means creating a list of questions you want to answer with the data. Our starting list of questions is:
- What are the total monthly sales?
- What are the monthly sales by country?
- Which products are our top sellers?
- Which stores perform the best?
- What is the average size of each order?
- What is the average customer lifetime value in each country?
As we dive into the data and write our queries, we may refine these questions and adjust our queries accordingly.
This is how you should start all your SQL projects: write a list of the questions you have for the data and then write the queries that will give you the answers. As you work through the queries and data, new questions will come up. Try to answer them with your queries too.
Now we’re ready to start writing the queries for our sales dashboard! Keep our SQL Basics Cheat Sheet handy in case you need a quick refresher on syntax.
What Will We Include in Our SQL Project Dashboard?
Report 1: Monthly Sales
Question: What is the company’s revenue for each month?
The first report in our dashboard shows the revenue for all past months (across every year). This query can be used to plot trend lines of income, or it can be analyzed as-is. We want to view the data in chronological order, starting with the most recent dates.
Approach
First, we choose the table our query will be based on. We would like to show three columns: the year and month of sale and the total revenue for that month and year. We need the total order amount and the date of sale, so the SalesOrderHeader
table from the Sales
schema is an obvious choice.
To display the year and month, we’ll use the EXTRACT
function:
EXTRACT(< MONTH / YEAR > FROM OrderDate) |
We will use these columns when grouping and ordering the result.
To get the total revenue for that specific month, we can use SUM(TotalDue)
together with grouping by the previous two columns. This will lead us to have separate SUM()
results for each unique year-month pair.
Here’s the final query:
SELECT EXTRACT( YEAR FROM OrderDate) AS OrderYear, EXTRACT( MONTH FROM OrderDate) AS OrderMonth, ROUND( SUM (TotalDue), 2) AS TotalRevenue FROM Sales.SalesOrderHeader GROUP BY EXTRACT( YEAR FROM OrderDate), EXTRACT( MONTH FROM OrderDate) ORDER BY OrderYear DESC , OrderMonth DESC ; |
Report 2: Monthly Revenue by Country
Question: What is the monthly revenue for each country?
We would like to see the total revenue for each month (of each year) for every country in the database. The result should be sorted in chronological order, with the more recent dates coming first. We can use this data to plot trend lines for each country or analyze the data as-is.
Approach
This query is similar to the previous query – we just have to add information about the country for each purchase. How can we do that?
The SalesOrderHeader
table has a column named TerritoryId
, which refers to the SalesTerritory
table. This table gives us access to the CountryRegionCode
column.
Country names are stored in the CountryRegion table of the Person
schema. We can join this table to the SalesTerritory
table using the region code. This leaves us with a clear way to expand our query.
Joining the tables like this SalesOrderHeader
-> SalesTerritory
-> CountryRegion
gives us access to the CountryRegion.Name
column. We can add it as the first column in the SELECT
statement and add it at the end of both the GROUP BY
and ORDER BY
.
Take a look at the final query:
SELECT cr. Name AS Country, EXTRACT( YEAR FROM OrderDate) AS OrderYear, EXTRACT( MONTH FROM OrderDate) AS OrderMonth, ROUND( SUM (TotalDue), 2) AS TotalRevenue FROM Sales.SalesOrderHeader soh JOIN Sales.SalesTerritory st ON soh.TerritoryId = st.TerritoryId JOIN Person.CountryRegion cr ON cr.CountryRegionCode = st.CountryRegionCode GROUP BY cr. Name , EXTRACT( YEAR FROM OrderDate), EXTRACT( MONTH FROM OrderDate) ORDER BY OrderYear DESC , OrderMonth DESC , Country; |
Here is a portion of the output:
Country | OrderYear | OrderMonth | TotalRevenue |
---|---|---|---|
Australia | 2014 | 6 | 9958.82 |
Canada | 2014 | 6 | 11471.62 |
France | 2014 | 6 | 3660.88 |
Germany | 2014 | 6 | 3818.44 |
Report 3: Bestselling Products
Question: What Are Our Bestselling Products?
We would like to add a list of the bestselling products together with some relevant information like lifetime sales and customer reviews. The report should show reasonably recent data.
Approach
The SalesOrderDetail
table, which holds the quantity of each item sold, will be our starting point for this query. We’ll build it incrementally.
The main metric in this query is the total quantity sold for each product, which we can obtain by using the combination of SUM(OrderQty)
and GROUP BY ProductId
.
However, the result of the query would be quite hard to read, as the products are only identifiable by their Id
. The products’ names are stored in the Product
table of the Production
schema. Joining the two tables using ProductId
and adding ProductName
leaves us with this query:
SELECT p.ProductId, p. Name AS ProductName, SUM (od.OrderQty) AS TotalUnitsSold FROM Sales.SalesOrderDetail od JOIN Production.Product p ON od.ProductID = p.ProductID GROUP BY p. Name , p.ProductId ORDER BY TotalUnitsSold DESC LIMIT 10; |
ProductId | ProductName | TotalUnitsSold |
---|---|---|
712 | AWC Logo Cap | 8311 |
870 | Water Bottle - 30 oz. | 6815 |
This query is a viable dashboard element, but it can be improved.
We have already laid down the foundation to work with products. Now, we want to add the ratings to our products. Let’s build it as a separate query, which we will later join with the main query.
Product reviews are stored in the ProductReview
table of the Production
schema. We mainly care about the ProductId
and the Rating
left by the customer, so we’ll average the rating scores and group the data by ProductId
.
Now we have a neat one-to-one mapping between the product and its ratings. Let’s round the result to one decimal place, in the usual review fashion:
SELECT ProductId, ROUND( AVG (Rating), 1) AS ProductRating FROM Production.ProductReview GROUP BY ProductId; |
ProductId | ProductRating |
---|---|
937 | 3.0 |
798 | 5.0 |
709 | 5.0 |
We can now wrap this query into a common table expression (CTE) to use together with our main query. CTEs are a way to create a temporary result set – like a virtual table that only exists within the context of the query. If you want a full explanation of how CTEs work, check out our comprehensive guide to CTEs in SQL.
Now, here’s a possible hitch: there might not be reviews for all the products. So, when we join Product
from the main query with the CTE, be sure to use a LEFT JOIN
so as to not lose any data.
It might be beneficial to only look at the most recent data. We can filter out old sales in the main query using the OrderDate column and subtract two months from the current date using INTERVAL ‘2 MONTHS’
. Then we can filter the data to only show the orders placed after that date.
If you decide to change the date range, modifying the INTERVAL
part is quite easy. To learn more about the INTERVAL
syntax, take a look at our Standard SQL Functions course.
This is the full query:
WITH ProductRating AS ( SELECT ProductId, ROUND( AVG (Rating), 1) AS ProductRating FROM Production.ProductReview GROUP BY ProductId ) SELECT p.ProductId, p. Name AS ProductName, SUM (od.OrderQty) AS TotalUnitsSold, pr.ProductRating AS ProductRating FROM Sales.SalesOrderDetail od JOIN Sales.SalesOrderHeader oh ON od.SalesOrderId = oh.SalesOrderId JOIN Production.Product p ON od.ProductID = p.ProductID LEFT JOIN ProductRating pr ON pr.ProductId = p.ProductId WHERE OrderDate > CURRENT_DATE - INTERVAL '2 MONTH' GROUP BY p. Name , p.ProductId, pr.ProductRating ORDER BY TotalUnitsSold DESC LIMIT 10; |
ProductId | ProductName | TotalUnitsSold | ProductRating |
---|---|---|---|
921 | Mountain Tire Tube | 187 | N/A |
873 | Patch Kit / 8 Patches | 181 | N/A |
870 | Water Bottle - 30 oz. | 168 | N/A |
922 | Road Tire Tube | 160 | N/A |
878 | Fender Set - Mountain | 107 | N/A |
711 | Sport-100 Helmet, Blue | 96 | N/A |
712 | AWC Logo Cap | 95 | N/A |
708 | Sport-100 Helmet, Black | 93 | N/A |
923 | Touring Tire Tube | 91 | N/A |
871 | Mountain Bottle Cage | 84 | N/A |
Now this is a list worthy of being displayed on a representative’s dashboard! An understanding of sales trends is something every employee should have access to – and now we have a query that delivers it.
Our dashboard is growing; let’s add some more metrics!
Report 4: Top-Performing Stores
Question: What Are the Top 10 Stores by Sales for the Past 2 Months?
We’d like to show the top ten stores – including online stores – and their respective revenue gain for the past two months. A “top store” is defined as having the most revenue over the past 2 months. Treat all online stores as one entity.
Approach
As usual, we will be starting with the SalesOrderHeader
table. To find out which store is responsible for which sale, let’s take a look at the Store
table in the Sales
schema.
Each store can be linked to a sale via the SalesPersonId
column. All sales where SalesPersonId
is NULL
also have their OnlineOrderFlag
set to true. This will make the job of grouping and displaying online sales easier, as we can treat any sale where SalesPersonId
is NULL
as an online sale.
To not lose data when joining the tables on SalesPersonId
(which is sometimes NULL
), use a LEFT JOIN
between SalesOrderHeader
and Store. In the SELECT
statement, we care about two things: the name of the store, and the total sales made at that store. Don’t forget to wrap the Store
.Name column with COALESCE(..., ‘Online’)
to make sure that all online orders have a nice readable label.
To filter out old data, use the same WHERE condition as in the last report:
WHERE OrderDate > CURRENT_DATE - INTERVAL ‘2 MONTHS’. |
Order the results by the total sales amount (showing larger values first) and use LIMIT 10
to only return the top 10 stores by revenue in the selected period.
Here’s the query:
SELECT COALESCE (s. Name , 'Online' ) AS StoreName, ROUND( SUM (so.TotalDue), 2) AS TotalSalesAmount FROM Sales.SalesOrderHeader so LEFT JOIN Sales.Store s ON so.SalesPersonId = s.SalesPersonId WHERE OrderDate > CURRENT_DATE - INTERVAL '2 MONTHS' GROUP BY s. Name ORDER BY TotalSalesAmount DESC LIMIT 10; |
StoreName | TotalSalesAmount |
---|---|
Online | 2419079.86 |
Friendly Bike Shop | 577136.64 |
Exceptional Cycles Services | 495918.62 |
Number One Bike Co. | 495918.62 |
Report 5: Revenue Sources
Question: How Does Online Revenue Compare with Offline Revenue?
For this dashboard metric, we would like to include data about how many orders were placed online vs. at a store. The final results should be broken down by month and year.
Approach
Once again, we will build this query iteratively. To split the sales into online and offline groups, we can use the OnlineOrderFlag
from the Sales.SalesOrderHeader
table. This table also conveniently includes the total price of the order, so this metric can be calculated from one table.
However, the OnlineOrderFlag
is of the BOOLEAN data type; using it in the GROUP BY
clause will leave us with true
and false
groups. To make the data more readable, we can use a CASE WHEN
expression to replace all the trues with “Online” and all the falses with “Store”. And since the OnlineOrderFlag
already evaluates to BOOLEAN
, we don’t need to modify it to use it as a condition. For a more in-depth explanation on how we are using CASE WHEN
in this example, read our guide on how to use CASE WHEN with GROUP BY.
Now that we have proper group names, we can select the other metrics that compare the revenue sources: the total number of deals closed and the total revenue from those deals. You can find the first one by simply counting all the SalesOrderIds and the second one by summing up all the TotalDue values. The grouping can be done using the bare OnlineOrderFlag
and not the CASE WHEN
expression; in this usage, we are simply mapping values.
Here is how the current query looks:
SELECT CASE WHEN OnlineOrderFlag THEN 'Online' ELSE 'Store' END AS OrderOrigin, COUNT (SalesOrderId) AS TotalSales, SUM (TotalDue) AS TotalRevenue FROM Sales.SalesOrderHeader GROUP BY OnlineOrderFlag ORDER BY TotalRevenue DESC ; |
OrderOrigin | TotalSales | TotalRevenue |
---|---|---|
Online | 27659 | 3244139 |
Store | 3806 | 90775446 |
We can clearly see a conclusion: Physical stores make almost 10 times less sales but produce 3 times as much revenue as the online store.
We can make this data more meaningful by filtering it by year and month so we can see trends in the revenue streams. The Sales.SalesOrderHeader
table includes the OrderDate
column, which we can use to obtain just the YEAR
and MONTH
parts as we did in the first report. If we add this as extra columns and use them in the GROUP BY
, we will see that both “Store”
and “Online”
groups have been split into different years and months. We can additionally filter the dates, like we did before with CURRENT_DATE
and INTERVAL
.
Here’s the full query and a sample of the results:
SELECT CASE WHEN OnlineOrderFlag THEN 'Online' ELSE 'Store' END AS OrderOrigin, EXTRACT( YEAR FROM OrderDate) AS OrderYear, EXTRACT( MONTH FROM OrderDate) AS OrderMonth, COUNT (SalesOrderId) AS TotalSales, SUM (TotalDue) AS totalRevenue FROM Sales.SalesOrderHeader WHERE OrderDate > CURRENT_DATE - INTERVAL '2 MONTHS' GROUP BY EXTRACT( YEAR FROM OrderDate), EXTRACT( MONTH FROM OrderDate), OnlineOrderFlag ORDER BY OrderOrigin, OrderYear DESC , OrderMonth DESC ; |
OrderOrigin | OrderYear | OrderMonth | TotalSales | TotalRevenue |
---|---|---|---|---|
Online | 2014 | 6 | 939 | 54151,48 |
Store | 2014 | 5 | 12584 | 3850111,69 |
Online | 2014 | 5 | 406 | 2156071,52 |
Store | 2014 | 4 | 2 | 1428,61 |
Report 6: Average Order Size per Country
Question: What Is the Average Order Size?
We would like to display each country and the average order size for all orders coming from that country. Countries with bigger average order sizes should be shown first.
Approach
We have already handled getting the country name for each sale in one of the previous reports, so we can focus on the order sizes here. It’s important to note that we will have to aggregate two times in this query: First to get the order size for each order, and then to get the country average. We’ll start by getting the order sizes.
Each order can be identified by an entry in the SalesOrderHeader
table. It will have multiple respective entries in the SalesOrderDetail
table corresponding to each product in the order. Summing up the OrderQtys
from the SalesOrderDetail
table for each SalesOrderId
leaves us with neat order_number–order_size pairs.
We’ll use a couple of JOINs
to get the country names, forming a CTE like this:
WITH OrderSizes AS ( SELECT sod.SalesOrderId, SUM (OrderQty) AS ProductCount, cr. Name AS Country FROM Sales.SalesOrderHeader soh JOIN Sales.SalesOrderDetail sod ON sod.SalesOrderId = soh.SalesOrderId JOIN Sales.SalesTerritory st ON soh.TerritoryId = st.TerritoryId JOIN Person.CountryRegion cr ON cr.CountryRegionCode = st.CountryRegionCode GROUP BY sod.SalesOrderId, cr. Name ) SELECT * FROM OrderSizes; |
Now it’s just a matter of getting the average order size from the CTE and grouping the result by country. Don’t forget to order the data to show the largest average order sizes first.
Take a look at the complete query and its partial result:
WITH OrderSizes AS ( SELECT sod.SalesOrderId, SUM (OrderQty) AS ProductCount, cr. Name AS Country FROM Sales.SalesOrderHeader soh JOIN Sales.SalesOrderDetail sod ON sod.SalesOrderId = soh.SalesOrderId JOIN Sales.SalesTerritory st ON soh.TerritoryId = st.TerritoryId JOIN Person.CountryRegion cr ON cr.CountryRegionCode = st.CountryRegionCode GROUP BY sod.SalesOrderId, cr. Name ) SELECT Country, ROUND( AVG (ProductCount), 2) AS AverageOrderSize FROM OrderSizes GROUP BY Country ORDER BY AverageOrderSize DESC ; |
Country | AverageOrderSize |
---|---|
United States | 12.80 |
Canada | 12.14 |
France | 7.45 |
United Kingdom | 6.24 |
Germany | 5.01 |
Australia | 2.67 |
Report 6: Average Lifetime Customer Value per Region
Question: What Is the Average Customer Lifetime Value in Each Region?
Customer Lifetime Value (CLV) is an important metric showing how much the company can expect a newly acquired customer to spend during their entire relationship with the brand. CLV is calculated by summing the revenue from all the customer’s purchases. This is most useful when calculating the budget for marketing campaigns; it’s best shown grouped by the region of operations.
Approach
Let’s first build the query that calculates the average CLV for each customer individually. We already know how CLV is calculated: by summing all the TotalDue
values for each customer from the SalesOrderHeader
table. This will give us an intermediate result. We want to group it later by country, so it is beneficial to include some of that data in here too.
The Customer
table in the same schema has a TerritoryId
column, which we can use later to obtain more information. For now, let’s just add TerritoryId
to the CLV query as an additional column. This requires a simple join and an expansion of the GROUP BY
statement.
Here is how the query looks so far, together with some sample results:
SELECT cs.CustomerId, cs.TerritoryId, SUM (TotalDue) AS LifetimeRevenues FROM sales.Customer cs JOIN sales.SalesOrderheader ord ON cstm.CustomerId = ord.CustomerId GROUP BY cs.CustomerId, cs.TerritoryId; |
CustomerId | TerritoryId | LifetimeRevenue |
---|---|---|
26264 | 4 | 34.56 |
30052 | 4 | 21863.90 |
24416 | 1 | 106.16 |
26745 | 9 | 2135.37 |
Half of the problem is solved. What’s left is to group this at the country level and make the result more readable. Let’s use a CTE to save the results of this query.
We can group the CTE by TerritoryId
and show the average lifetime value of customers per territory. This gives a technically correct result, though the territories are unidentifiable. The actual names of countries are stored in the CountryRegion
table from the Person
schema. It is identifiable by the CountryRegionCode
key. In the Sales
schema, we can find that key in the SalesTerritory
table. So, we will join the result of the CTE, through the SalesTerritory
table, with the CountryRegion
table. Now we can show the country’s name instead of its numeric Id
. Don’t forget to edit the GROUP BY
accordingly.
Take a look at the final query and some of its results:
WITH CustomerLifetimeRevenue AS ( SELECT cstm.CustomerId, ord.TerritoryId, SUM (TotalDue) AS LifetimeRevenue FROM Sales.Customer cstm JOIN Sales.SalesOrderHeader ord ON cstm.CustomerId = ord.CustomerId GROUP BY cstm.CustomerId, ord.TerritoryId ) SELECT cr. Name AS Country, ROUND( AVG (clr.LifetimeRevenue),2) AS AvgLifetimeCustomerValue FROM CustomerLifetimeRevenue clr JOIN Sales.SalesTerritory tr ON clr.TerritoryId = tr.TerritoryId JOIN Person.CountryRegion cr ON cr.CountryRegionCode = tr.CountryRegionCode GROUP BY cr. Name ORDER BY AvgLifetimeCustomerValue DESC , cr. Name ; |
Country | AvgLifetimeCustomerValue |
---|---|
Canada | 10971.34 |
United States | 8627.27 |
France | 4403.33 |
United Kingdom | 4394.69 |
Australia | 3259.14 |
Germany | 3024.18 |
Create Your Own SQL Project for Beginners!
We have successfully created queries that can power an informative sales dashboard. However, this article is meant to inspire you to start your own SQL project. Let’s recap how to build an SQL project:
- Find an interesting dataset.
- Come up with a business problem for this dataset, like we did with this sales dashboard.
- Write questions related to the problem which the project will help answer.
- Build SQL queries for the questions you wrote.
If you need a bit more structure, we recommend our mini-course SQL Databases for Practice. It contains 6 different databases that you can use in your own SQL project as well as some questions and suggestions to get you started.
If you would like to see more SQL beginner project examples, check out our article about creating a Northwind Store Project for your portfolio.
Hungry for more general practice? Check out our massive SQL Practice Track. Happy learning!