Back to articles list June 13, 2018 - 9 minutes read Key Insights for Mastering SQL Queries Using Tinder Data Example Aldo Zelen Aldo is a data architect with a passion for the cloud. From leading a team of data professionals to coding a data warehouse in the cloud, Aldo has experience with the whole lifecycle of data-intensive projects. Aldo spends his free time forecasting geopolitical events in forecasting tournaments. Tags: data analysis how to in sql SQL basics Going from zero to one can be daunting in any endeavor. The same is true for learning new programming languages, even simple ones like SQL. In this article, we’ll take a look at some key insights that will help you understand the nuances of mastering SQL queries. If you’ve never used SQL, you’re in the right place. When learning anything new, you’ll find that there are always some key insights or tips that can help you on your way. Mastering SQL queries is no different. Today, we’ll explore some ways you can accelerate your understanding of basic SQL queries, with visualization and analogies as the main drivers of mastery. With the basics covered, we’ll look at some important ways you can expand your SQL knowledge with exercises. This article is intended for complete newbies who have never worked with SQL in their lives. Laying the Groundwork with SELECT Let’s say you’ve installed Tinder and are looking at the collection of profiles available to you. The value you want to derive is simple: you’re interested in finding a date from among an array of profiles. But there are far too many profiles; you need to refine your search. Perhaps you prefer a tall man or a blue-eyed woman. Whatever your preference, you’ll need to define and refine your interests. The same process is true for the SELECT clause in SQL statements: you want to select some data and filter it by specifying certain criteria. So if you’d like to translate your Tinder request into a query, you’d write something like: SELECT NAME, SURNAME FROM TINDER_PROFILES; The best way to make the syntax stick is to read the statement aloud. Think of it as commanding the database to do something: There’s a reason why SQL is so intuitive: it’s a fourth-generation programming language (4GL). As such, it is really simple to use and understand, and so mastering SQL queries is relatively easy. If we would like to refine our search, we’d simply need to add a new clause to the statement to form a more complex query: SELECT NAME, SURNAME FROM TINDER_PROFILES WHERE eye_color = ‘blue’ and sex = ‘F’; In this statement, we’re simply refining our search to reduce the number of profiles we receive from the table named TINDER_PROFILES. How can you describe a database table? Think of it as a collection of every profile out there, where each profile has attributes like name, surname, sex, eye_color, and others. To make the statement more readable, just imagine that the equality sign reads as ‘is’ in the above query. One thing that confuses newbies is that the attributes we use to filter a table don’t always appear in the select list. But this is perfectly acceptable if you think about it—we’re only interested in selecting certain information, but we also need to apply filters to our search. Whether or not we actually select those filters is up to us. In JOINs We Find Strength Let’s move on to a slightly more difficult way to express ourselves that will help us in mastering SQL queries. This time, we’ll use multiple entities—or, as we call them in SQL, tables! Below is an example showing two related tables. One table contains our friends, and the other table contains their addresses. When we want to command the database to retrieve all of our friends who live in the city of London, we can write something like this: SELECT NAME, SURNAME FROM FRIENDS JOIN ADDRESSES ON (FRIENDS.ADDRESS_ID = ADDRESSES.ID) WHERE addressES.CITY = ‘London’ Let’s translate this query to regular English: The keyword here is JOIN. With a JOIN, we are commanding the database to connect two tables in some manner. The connection is established via identification. It’s simpler than it sounds. Basically, each friend has an address_id that refers to the city where they live. Each city has its own id. These two attributes are the same, but they’re just named differently and are found in two separate tables. Bear in mind that every friend lives only in one city, and each city has a unique id. If we had cities that shared IDs, we would get erroneous result sets in some queries. So let’s say that London is represented by an address_id of 25, and Mumbai has the same ID, 25. Our friend Steve lives at an address with an ID of 25. The query where we look for our friend’s city is: SELECT FRIENDS.NAME, ADDRESS.CITY FROM FRIENDS JOIN ADDRESSES ON (FRIENDS.ADDRESS_ID = ADDRESSES.ID) WHERE FRIENDS.NAME = ‘Steve’ The result set for the query would look like : Steve London Steve Mumbai As you see in the data name Steve is duplicated to two cities because we have not been diligent and we allowed the table of cities to have two cities with the same ID. My personal tip is that you should always check the underlying data to see if there are some erroneous rows before joining tables. Checking underlying data will always be part of your routine. Okay, so we have to be honest with you: structuring our database model like this—with the possibility of identical ids in the same table–is not good SQL database design. If you’re interested, you can learn more about how to create tables in SQL. It’s useful to know it while mastering SQL queries. This JOIN is an example of an INNER join, but there are many other kinds of SQL JOINs out there. Unions Are a Good Thing By now, you should be a little closer to mastering SQL queries that tell a database to return certain data. Now, we’ll look at how you can unify two different result sets. This is not to be confused with JOINing two tables. Let’s say you have a table that contains all the foods that you like; let’s call this table my_foods. We also have a table for every food that my friend Steve likes, aptly named steve_foods. How could we look at all the foods that we both enjoy? We could write this UNION: SELECT name FROM my_foods UNION SELECT name from steve_foods; This query returns food that both you and Steve like. There is one more thing to note here: the resulting names of foods do not contain any duplicates. Food that both you and Steve enjoy, like pizza, will only be listed once in the result set of this query. If we’d like to know what foods you and Steve both enjoy, we could seek the intersection of the two sets: SELECT name FROM my_foods INTERSECT SELECT name FROM steve_foods; Here’s a good hack: if we want to list all the foods that you and Steve like, regardless of whether they’re duplicates, we can use the UNION ALL clause, like so: SELECT name FROM my_foods UNION ALL SELECT name FROM steve_foods; TIP: Always use UNION ALL when possible, as it is much faster than UNION. Okay, so we have to be honest with you: structuring our database model like this—as two tables, my_foods and steve_foods—is not good SQL database design. If you’re interested, you can learn more about how to create tables in SQL. As I’ve already mentioned, it’s good to know it while mastering SQL queries. Aggregating Your Knowledge After mastering SQL queries using simple selecting, joining, and unifying, you should move on to learning how you can calculate certain values for entire data sets or groups within these data sets. Calculating a value for a column or data set is known as aggregation, and queries that use aggregation are known as aggregate SQL queries. Let’s say we want to calculate the average price we spend on different categories of personal items per month. These categories could be food, entertainment, travel, and so on. How would you go about answering this question in the real world? We would do something like this First, we would log every purchase throughout a given year. In this log, we would write the name of the item, price, category, and time of purchase. We would sum up the prices of all the items within their respective categories. We would then divide the total yearly item expenditure by 12 to get our monthly expenditure. To get the sum of all the items in a certain category, we would have to GROUP the items BY category. So the command to the database would be something like this: SELECT CATEGORY, SUM (PRICE)/12 AS AVERAGE_EXPENDITURE_OF_CATEGORIES FROM EXPENDITURES GROUP BY CATEGORY We tell the database to select the sum of the PRICE column from the EXPENDITURES table, divide that sum by 12, and name it average_expenditure_of_categories. And we group the results by category. Armed with information about how much we spend in each of these categories, we can now construct a budget. This special sort of SELECT query—one that calculates values like sums, averages, minimums, and so on for one or more rows—is known as an aggregate query. Learning to use them is yet another step of mastering SQL queries. Aggregation queries are always reductionist in the sense that the data sets they return always have fewer rows than the original table(s). The rows are returned in such a way that there is always one row for the group we are constructing. In our example, that means we have one row per category of item. It is important to note that we can SELECT only grouped categories and aggregate calculations like SUM, MIN, MAX and MEAN. This usually confuses newbies. A good trick to avoid syntactic errors when writing aggregation queries is to first write the query without the columns in the SELECT clause and to fill them in only after you have constructed the rest of the query. Mastering SQL Queries: What’s Next? In this short article, we looked at the basics of SQL queries and how you can gain insights into mastering SQL queries. We learned about simple SELECT statements that we can use to select useful information from a table. With JOINs, we expanded our knowledge to include more complex ways to talk to tables. UNIONs are useful for representing and working with our data in set form. With aggregation, we dove into the logic of constructing statements that calculate values over columns. But this is just the beginning of mastering SQL. With these tools, you can start your journey of learning SQL. There are many more useful features of the language that you can use to solve business problems and pet projects. So open your favorite SQL editor, try out the things we touched on in this article, and continue learning! Tags: data analysis how to in sql SQL basics You may also like SQL Filtering 101 Sometimes, SQL queries return lots of data you don't need. Look at a simple example of filtering SQL queries to reduce the complexity of your data. Read more Why Learn SQL Over the Summer For the love of SQL, we decided to motivate you (yes, you) to learn SQL over the summer. Here's why! Read more How to Become an SQL Developer Have you ever wondered what it takes to become an SQL developer? What does an SQL developer do, what are job requirements, and which training you should attend? Read more How Long Does It Take to Learn SQL? Top Tips for SQL Proficiency How long does it take to Learn SQL? We break it down so you know exactly how long you’ll need to achieve SQL proficiency. Read more Is SQL a Programming Language? SQL is a great tool for talking to relational databases. But is it a programming language? Learn why the answer is definitely yes. Read more Is SQL Worth Learning? Is learning SQL something to consider? Or is it yet another useless skill that employers don’t value? Read more How Will Learning SQL Improve My Daily Work? Do you think that learning SQL is only for people in IT? Not true! A knowledge of SQL is good for most professions. Read more How to Start Writing SQL Reports Learn how to write SQL reports. Reporting in SQL will benefit you professionally, regardless of your role or industry. Read more Should I Learn SQL as a Data Analyst? Are you a data analyst wondering if you should learn SQL? You should, and here's how! Read more What Programming Language Should You Learn? Guide to what programming language should you learn first when starting learning to code Read more Why Use SQL Over Excel If you're using Excel as a data analyst, you're missing something valuable. In this article, I explain why the use of SQL over Excel is the right choice. Read more Career Change: From Accountant to Database Designer Personal insight into career change, learning SQL, becoming a database designer, and how your life choices can help you accomplish all that and more. Read more Forget About Excel, High Five With SQL Find out how easy it is to learn SQL, why it can make you work better, and how LearnSQL.com makes the learning process fun. Read more 5 Great Benefits of Learning SQL Data analysis will help you and your business grow. What are the main benefits of learning SQL? You’ll find out in this article! Read more Why Is SQL Worth Learning? With an increasing amount of data comes the need for databases that can store information for retrieval and analysis by trained specialists. Read more Four Reasons You Must Learn SQL in Data Science SQL can help you build a solid foundation for your analytical career. Let’s see how seasoned professionals use SQL in data science. Read more SQL Mythbuster – 5 Reasons Why No One Should Be Afraid of SQL Anyone can learn SQL. It’s not as hard as you think! And you don’t have to be a hard-core nerd. Let me tell you my story. Read more Should I Learn SQL? 4 Convincing Reasons There are 250+ computer programming languages in popular use. However, there are good reasons why you should learn SQL, or Structured Query Language. Read more Our Picks for 2020’s 7 Best Online SQL Schools Looking for the best online SQL courses but aren’t sure where to start? I evaluated the top 7 ranked online SQL schools to help you find your match. Read more Subscribe to our newsletter Join our weekly newsletter to be notified about the latest posts.