Back to articles list Articles Cookbook
12 minutes read

SQL for Data Analysis: 15 Practical Exercises with Solutions

Are you new to the world of SQL and eager to unlock the power of data analysis?. In this article, we'll improve our SQL skills through 15 practical, hands-on exercises designed specifically for beginners. Because when it comes to SQL, practice truly makes perfect!

We’re going to shine a spotlight on the critical domain of data analysis, where SQL takes center stage. SQL, or Structured Query Language, plays a pivotal role in data analysis. It provides a robust framework for querying, transforming, and extracting invaluable insights from databases. If you're eager to delve deeper into the world of SQL and become a master of data analysis, you can explore additional information and resources in our article Learn SQL for Data Analysis.

What we aren’t going to do is bore you with theory. Instead, we firmly believe in the power of learning through hands-on experience. That's why we've curated a collection of 15 beginner-friendly SQL exercises that immerse you in the art of data analysis – all while utilizing a real-world dataset from a store.

These exercises are drawn from our comprehensive course, Basic SQL Practice: A Store, which offers a total of 169 interactive online exercises. The exercises in the article are grouped into different topics: single table queries, JOIN queries, ORDER BY queries, GROUP BY exercises, and other features. If you enjoy these exercises, you'll undoubtedly want to explore the course for a more extensive learning experience.

Now, get ready to roll up your sleeves and fully immerse yourself in the realm of SQL-powered data analysis!

Store Database Overview

The store database consists of six tables, each serving a unique purpose and structured for specific functions. These tables are pivotal to the SQL exercises at hand, and gaining a comprehensive understanding of their components is important. Below, we will look at a brief description of each table.

SQL for Data Analysis
  • employee: Contains employee details like ID, name, birth date, address, city, country, and immediate supervisor.
  • customer: Stores customer information, including customer ID, name, company, email, address, city, and country.
  • purchase: Records order details, including order ID, customer ID (who placed the order), employee (who serviced the order), total price, and purchase and shipment details.
  • purchase_item: Connects purchases with products via ID, product, unit price, and quantity.
  • category: Provides insights into product categories using category ID, name, description, and parent category ID.
  • product: Lists store products and includes product ID, product name, category ID, quantity per unit, unit price, units in stock, and product status.

Now that we have a better understanding of the tables in the store database, let's dive into some SQL exercises to help sharpen your data analysis skills.

Data Analysis SQL Exercises

Since practice is essential for developing and refining your SQL skills, these exercises will serve as valuable tools. We've even created a SQL for Data Analysis Cheat Sheet to make things easier for you! You might want to keep it handy on your journey through the following 15 SQL exercises.

Single Table Queries

In this section, we are going to focus on SQL queries involving a single table. These exercises will help you master the foundational art of retrieving, filtering, and working with data within a single dataset.

Exercise 1: All Products

Exercise: Display all data present in the product table.


FROM product;

Solution explanation: This simple query retrieves all the data from the product table. The asterisk (*) is used to indicate that we want to select all available columns in the table, essentially fetching all the data. After the FROM keyword, we give the name of the table we’re selecting from (here, the table product).

Exercise 2: Products with the Unit Price Greater Than 3.5

Exercise: Display product names for products with a unit price greater than or equal to 3.5.


Solution explanation: Using the WHERE clause, we filter for product_names with a unit price greater than or equal to 3.5. The WHERE clause restricts the rows returned by the query to only those meeting the specified criteria.

Exercise 3: Products with Specific Conditions for Category and Price

Exercise: Display data for all products in categories with an ID of 1 (Food) or 5 (Fruits and Vegetables) and with a unit price above 3.5.


FROM product
WHERE (category_id = 1 OR category_id = 5)
  AND unit_price > 3.5;

Solution explanation: This query uses both the AND and OR operators to create complex conditions that filter products based on category and unit_price.

The OR operator allows products from either category 1 (Food) or category 5 (Fruits and Vegetables), while the AND operator ensures that selected products must have a unit price greater than 3.5. The use of brackets enhances query readability.

JOIN Queries

Joins are a fundamental concept in SQL. They are crucial for data analysis because they allow you to combine and analyze data from different sources, enabling comprehensive insights into your datasets. For more practice exercises involving joins, refer to our article SQL Joins: 12 Practice Questions with Detailed Answers.

Exercise 4: Products and Categories

Exercise: Select product names along with their categories. Display two columns: product_name and category_name.


  product_name, AS category_name
FROM product
JOIN category
  ON product.category_id = category.category_id;

Solution explanation: In this exercise, an INNER JOIN combines data from the product and category tables. An INNER JOIN selects only the rows that have matching values in both tables. This ensures that only products with corresponding categories are retrieved, creating a meaningful and accurate result.

The ON part of the query defines the relationship between the two tables. Additionally, the use of aliases (AS category_name) allows us to rename the columns as required by the question.

Exercise 5: Purchases and Products

Exercise: For each purchase, display the purchase ID, product name, unit price at the time of purchase, and quantity of each product's items.


FROM purchase_item
JOIN product
  ON purchase_item.product_id = product.product_id;

Solution explanation: This query combines data from the purchase_item and product tables, allowing us to display relevant purchase information. In SQL, INNER JOIN and JOIN are often used interchangeably, but they serve the same purpose: retrieving matching rows from both tables.

Exercise 6: Purchases and Categories

Exercise: For each purchase, display all product categories bought in this purchase. Show each category only once for each purchase.


  purchase_id, AS category_name
FROM purchase_item
JOIN product
  ON purchase_item.product_id = product.product_id
JOIN category
  ON product.category_id = category.category_id;

Solution explanation: In this exercise, we're joining three tables: purchase_item, product, and category. Joining more than two tables is achieved by sequentially connecting each table through their related columns. If you want to learn more about joining multiple tables, read How to Join 3 Tables (or More) in SQL.

The use of the DISTINCT keyword here ensures that each product category appears only once for each purchase, providing a clear and concise list of categories associated with every purchase. This can be especially useful when you want to avoid redundant information in your result set.

ORDER BY Queries

Sorting data is a crucial step in data analysis because it allows you to organize information in a structured and meaningful way. The following exercises will teach you how to arrange data meaningfully for reporting and analysis purposes.

Exercise 7: Sorted Categories

Exercise: Display all categories' data. Sort the categories by name in ascending (A to Z) order.


FROM category

Solution explanation: In this exercise, we use the ORDER BY clause to sort the results from the category table in ascending order based on the name column. This arrangement allows you to view the categories in alphabetical order. The ASC keyword is optional here; it represents the default sorting order, which is ascending.

Exercise 8: Employees Sorted by Birth Date

Exercise: Show employees’ last names, first names, and birthdates. Sort results by employee age in ascending order (Hint: Use the birth_date column).


FROM employee
ORDER BY birth_date DESC;

Solution explanation: Since there is no age column in our table, we sort employees by birth_date in descending order using the DESC keyword. Sorting by birth_date in descending order allows you to view employees from the youngest to the oldest.

The use of DESC is crucial here, it reverses the default ascending sorting order and achieves the desired outcome.

Exercise 9: Products Sorted by the Number of Units

Exercise: Display data for all products, sorting the results by the number of units in descending order and product name in ascending order.


FROM product
ORDER BY units_in_stock DESC, product_name ASC;

Solution explanation: This query orders products first by units_in_stock in descending order, and then by product_name in ascending order. The first sorting condition ensures that products with the most units in stock appear at the top; products with the same number of units are further sorted alphabetically by name.

GROUP BY Queries

In this section, we explore SQL queries that involve grouping data for analysis. GROUP BY is essential for aggregating and summarizing information based on specific criteria. Check out GROUP BY in SQL Explained for more information.

Exercise 10: The Average Unit Price for Each Category

Exercise: For each product category, show its name and find the average unit price. Display two columns: name and average_unit_price.


  AVG(unit_price) AS average_unit_price
FROM product
JOIN category
  ON product.category_id = category.category_id

Solution explanation: In this exercise, we're calculating the average unit price for each product category. The query joins the product and category tables to associate products with their respective categories. The GROUP BY clause is then used to group the data by category_name. Using AVG(), we can calculate the average unit price within each category – giving us insights into the price distribution across different product categories.

Exercise 11: The  Number of Customers in Cities

Exercise: Count how many customers live in each city except for Knoxville and Stockton. Sort the results by the city name in ascending order. Display two columns: city and customers_quantity.


  COUNT(customer_id) AS customers_quantity
FROM customer
WHERE city <> 'Knoxville'
  AND city <> 'Stockton'
ORDER BY city;

Solution explanation: In this query, we use the WHERE clause with the <> operator (which means not equal to) to filter out records with the city name of Knoxville or Stockton. This filtering ensures that the analysis covers all cities except for the specified ones.

The COUNT() function then calculates the customer count for each of the remaining cities. The GROUP BY clause groups the data by city name, enabling us to count customers for each city.

Finally, the results are ordered in ascending order by city, providing a clear and organized view of customer distribution across various cities.

Exercise 12: The Number of Discontinued Products

Exercise: For each category, find the number of discontinued products. Show only the categories with at least three discontinued products. Sort the rows by the number of discontinued products in descending order. Display two columns: name (the name of the category) and discontinued_products_number.


  COUNT(product_id) AS discontinued_products_number
FROM product
JOIN category
  ON product.category_id = category.category_id
WHERE discontinued IS TRUE
HAVING COUNT(product_id) >= 3
ORDER BY COUNT(product_id) DESC;

Solution explanation: In this query, we utilize the HAVING clause to filter categories with at least three discontinued products. The HAVING clause is applied after the GROUP BY operation and allows us to filter the aggregated results based on the specific condition. For more information on using HAVING, read our articles The SQL HAVING Clause Explained and HAVING vs. WHERE in SQL: What You Should Know.

Additionally, you'll notice the use of the COUNT() function within the ORDER BY clause. This allows us to sort the rows based on the count of discontinued products in descending order.

The ORDER BY clause is flexible and can include aggregate functions, making it a powerful tool for arranging data based on aggregated values.

Other SQL Features

In this section, we delve into additional SQL features to expand your toolkit. We'll explore working with NULL values and cleaning up data using the ROUND() function. These features are crucial for handling complex data analysis scenarios.

Exercise 13: Employees with an Unknown Hire Date

Exercise: Display the last and first names of employees with an unknown hire date.


FROM employee
WHERE hire_date IS NULL;

Solution explanation: In this query, the WHERE clause with hire_date IS NULL allows us to filter and select records where the hire date is missing. This type of query can be a crucial analysis for identifying incomplete employee records or data entry errors within the dataset.

Exercise 14: Number of Employees with Unknown Birth and Hire Dates

Exercise: Count the number of employees with both unknown birth and hire dates.


  COUNT(*) AS employees_number
FROM employee
WHERE birth_date IS NULL
  AND hire_date IS NULL;

Solution explanation: In this query, we again use the IS NULL condition to filter for rows where the birth_date and hire_date are both missing. This allows us to COUNT() only those employees who lack both birth and hire date information.

Exercise 15: Percentage of Money Spent by the Customer on Purchase

Exercise: For each customer who made a purchase, display the ID of each purchase made by this customer and the percentage of money spent on that purchase relative to all the money spent by that customer.


  ROUND(total_price * 100.0 / (SELECT SUM(total_price) FROM purchase WHERE customer_id = p.customer_id)) AS percentage
FROM purchase p
JOIN customer 
  ON p.customer_id = customer.customer_id;

Solution explanation: In this solution, we use two queries. The main query incorporates a subquery within the SELECT statement to calculate the SUM() of total prices for purchases made by the same customer.

The percentage is then calculated using the provided formula and the ROUND() function is applied to round the result. This query is a powerful tool for gaining insights into the purchasing behavior of each customer.

Ready for More SQL Exercises?

Mastering SQL is an essential skill for data analysis – and the best way to learn it is by doing. In this article we went through a selection of 15 beginner-friendly SQL exercises drawn from our comprehensive course, Basic SQL Practice: A Store.

But these exercises are just the beginning. If you're eager for a more extensive learning experience, we invite you to explore our complete course that features 169 interactive exercises. For those who crave even greater challenges, this course is just one of the nine courses available on the SQL Practice learning path – each offering an engaging format.

We also have our SQL Monthly Practice track, which is regularly updated with fresh exercises to keep your skills sharp and up-to-date with the latest SQL developments.

Immerse yourself in more SQL exercises and you'll master the art of data analysis. Your journey to becoming an SQL expert begins with continuous practice. So, keep exploring and honing your skills. Success is just around the corner!