Where Can I Find Free Online Data Sets to Practice SQL?

Jakub Romanowski, 21st Aug 2020
Tags: sql learn sql online practice

Okay, you've installed the RDBMS of your choice. You know the basics of SQL and ... what's next? You would like to practice the SQL functions you learned in our course. Maybe you have even done our SQL Practice Set course and are hungry for more.

To work with a database, first of all, you need (yes, this is not a surprise) a database! Where do you get the data for your SQL queries? Plenty of it is available for free on the internet, but you need to know where to look. In this article, I'll share where you can find some cool data sets for your SQL practice. Sounds good? Let's get started.

First, a comment. Databases on the spread of COVID-19 around the world should be at the top of my list. There are tons of great databases that keep track of infections and deaths. However, since they are so common today, I will not point to any specific one of those. Instead, I want to share other interesting data sets with you, and I have selected seven free sources of databases that are great for practicing SQL.

1. Google Trends

Google collects petabytes of data. Every click, every query entered into the search engine, everything is tracked and saved. So why not use this for your SQL practice?

Google Trends is one of the largest public data sets available. The data can be compiled and analyzed practically freely. The sheer volume of data from which you can choose is overwhelming! Google allows you to use its search data and analyze what people are searching for and when they do it most often. Sounds like something from one of Orwell's books? Let me just add that you can analyze almost every possible search query and keyword, along with their history since 2004.

To keep all this under control, various filters and data breakdowns are available.
Thanks to these filters, in a few minutes you can narrow down your search to, for example, a specific location, time period, or kind of data.

Another interesting feature is the list of trends, i.e. the most popular terms currently searched on Google. You can also click on any of the examples suggested by Google. While writing this article, I learned that most of the searches related to Taylor Swift in the last 30 days were from Utah. I can't figure out why. Do you know? Let me know in the comments.

The annual rankings are also great. Google shows five search terms in several categories. It is also worth checking out the data visualizations Google offers. Be warned, though: looking through all of this is really addictive and time-consuming!

Each summary and each report can be conveniently downloaded with one click as a CSV file. You can import these files into your database and query them using SQL. The possibilities are limited only by your imagination and your courage in entering words into the Google Trends search box.

For the first try, I recommend entering the phrase "Learn SQL" in Google Trends. See for yourself: is what we write about on the blog only an empty promise, or does the world really need people who know SQL? I'll give you a hint: the trend is growing!

2. Data.gov

Data.gov is a gigantic and, more importantly, completely open and free collection of over 200,000 data sets from the US Government.

The website offers a great search engine where you can define topics of interest, time intervals, tags, locations, and even the data file format or data type. In just a few clicks, you can access information about your city's budget or the average academic performance of students from your alma mater. You can easily find what you need. Take some time to dig deeper.

Most of the data is offered in popular file formats such as JSON or CSV. A website like this is great not only for SQL practice but also for democracy and government transparency.

3. FiveThirtyEight

FiveThirtyEight is not just a collection of data sets. It's an ABC News site with articles, ratings, and essays. You will also find a lot of data ready to be used in a SQL project.

Are you interested in politics? See data from the U.S. presidential polls. You have access to data from many American research firms and think tanks; you can calculate averages and track changes. Biden or Trump? Find out who currently has more support in your home state. Each of the lists can be downloaded as a CSV file. Convenient, interesting, and engaging.

In addition to politics, you will also find a lot of information about sports (e.g. “The Pace Of Play Has Never Been Faster In The WNBA”), podcasts, and videos. FiveThirtyEight is one of the best websites of this type on the Internet. Have you come across a better one? Let me know in the comments.

4. Kaggle

When you learn SQL and use it, you will have to get to know this service sooner or later. It is more than just a collection of data sets. Rather, it's a place where members of the data lover community come and publish their creations. You will find not only interesting data sets but also a whole lot of other materials. All these can help you get better at understanding SQL and working with large databases.

Kaggle has a simple search engine, which makes it easy to find what you are looking for. You can also use the hints or see what is currently the most popular.

I am a sports fan, so I chose two databases for myself. The first is “international football results from 1872 to 2020.” It is a constantly updated database of results from over 40,000 international soccer matches. A huge dose of knowledge and historical statistics, with almost 150 years of soccer history in one database. It has to be impressive, and it is! By practicing SQL on it, you can, for example, compare your national team's results across specific years or against those of your most hated rivals.

The second database I found here was Lahman's Baseball Database.
Remember the movie “Moneyball,” starring Brad Pitt? With this data, you can feel like the owner of a baseball team and assemble your dream squad. Haven't seen Moneyball? Read my article “SQL, Databases, and Hollywood Movies.”

Lahman's Baseball Database contains complete batting and pitching statistics from 1871 to 2019. In addition, you get fielding statistics, standings, team stats, managerial records, post-season data, and much more. Sounds cool? Because it is.

Kaggle also gives you the chance to win some pretty good prizes. You earn them by taking part in competitions in which you develop predictive or classification models and compete with others on the results.

5. IMDb Data Set

Do you like movies? Then you must be familiar with IMDb. It's the world's largest online database on films, actors, directors, screenwriters, film agents, and other people associated with the industry.

IMDb (the Internet Movie Database) was established 30 years ago. Since then, a huge global community has been developing the website. The database currently has entries for over 6 million movies, with data on over 100 million related entities in total.

The website owners allow you to download their collections freely for personal use; you can't use them commercially. The data set is divided into smaller ones to make it easier to work with. For example, you can download just the information about movies in a given language or just about a specific director. It's all up to your imagination.

As an example, try to find out the following: in how many movie titles does the word "learning" appear? Are you able to find out?

6. Airbnb

Legend has it that Airbnb started when its founders rented someone an air mattress in their living room. Since then, their business has grown. Now there are thousands of locations around the world. Their website allows people who have unused rooms or apartments to connect with travelers who need a place to stay for the night.
The idea for this business is so simple that it's hard to believe no one came up with it before.

Airbnb has a database of its locations. You can download and use it for practicing SQL. Download, for example, all the data on Florence in Tuscany, Italy. You can search all the properties for a good place to stay, analyze user ratings, and compare prices. Found your favorite? Then you already know where to stay when you go on a vacation there!

In addition to the property lists, you can also download data that you can use in a GIS project. Don't know how but want to learn? I recommend a great PostGIS course on LearnSQL.com. PostGIS is a spatial extension of the PostgreSQL database. You will learn how PostGIS stores geographical data and how its basic geographical functions can be used in simple and complex SQL queries.

7. Earthdata

I saved something really interesting for last. With this service, you will gain access to data from NASA. Okay, so you won't find out if a UFO actually landed in Roswell. But you can learn a lot about the Earth's atmosphere, solar radiation, ocean currents, storms, and tectonic movements. You can watch everything live or analyze it as data.

Earthdata is part of the Earth Science Data Systems Program. As a regular user, of course, you won't get access to all NASA resources. But you have access to petabytes of data collected by scientists around the world on an ongoing basis.

Want to see how the Antarctic snow cover has changed over the past month? No problem. Perhaps you are more interested in massif movements in Central Asia? Or the air currents over New York? You can retrieve and process the data, all while honing your SQL skills. You can also view it live on the site. The sky's the limit, pun fully intended!

Data Sets for SQL Practice

These are my picks of cool data sets available online. There are many more like these. You are limited only by time and your will to act!
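Most of the sources above let you download CSV files, so here is a minimal sketch of how you might load such a file into a database and query it with SQL. It uses Python's built-in sqlite3 and csv modules, which is my own choice for the illustration and not something any of these sites prescribe. The sample rows, the movies table, and the load_csv helper are all made up; swap in your own download.

```python
import csv
import io
import sqlite3

# Made-up sample rows standing in for a downloaded CSV file (hypothetical data).
SAMPLE_CSV = """title,year,rating
Learning to Fly,1999,6.8
Machine Learning Stories,2015,7.1
The Long Road,2008,5.9
Deep Learning Days,2019,8.0
"""


def load_csv(conn, table, csv_text):
    """Create `table` from the CSV header row and insert every data row as text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()


conn = sqlite3.connect(":memory:")  # throwaway in-memory database
load_csv(conn, "movies", SAMPLE_CSV)

# How many titles contain the word "learning"?
# In SQLite, LIKE ignores case for ASCII letters by default.
(learning_count,) = conn.execute(
    "SELECT COUNT(*) FROM movies WHERE title LIKE '%learning%'"
).fetchone()
print(learning_count)

# Every value was loaded as text, so cast before comparing or sorting numerically.
recent = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM movies "
        "WHERE CAST(year AS INTEGER) >= 2010 "
        "ORDER BY CAST(rating AS REAL) DESC"
    )
]
print(recent)
```

For a real download, you could read the file with open(path, newline="") and pass its contents to the helper instead of the inline sample. If you already have a full RDBMS installed, its own CSV import tool will usually be the more convenient route.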
Remember that learning SQL is one thing, but you have to keep practicing afterward so that you don't forget what you learned in the courses.

If you are new to SQL, I recommend our beginners' course, SQL Basics. You will find everything you need to get started. It is really well constructed and well thought out, so you'll quickly grasp what it is all about.

If you already know SQL and want to develop further, I have an Advanced SQL track for you. There, you will learn how to use CTEs and window functions, among other things. Start learning today!