10th Nov 2021

A Chat with Anthony DeBarros, Author of Practical SQL

Jakub Romanowski

Some make model ships, others collect stamps. I read SQL books. I've read a lot of them, but only a very few deserve to be called really good. Some of them were included in my Best SQL Books list. The good news is that one of them just got a new release. It was great, even better than the previous edition! Moreover, I was able to talk to its author, Anthony DeBarros. Without further ado, here's what he told me.

Today we're talking to Anthony DeBarros, author of the brilliant book Practical SQL: A Beginner's Guide to Storytelling with Data. Thanks for being with us. The new second edition of your book will be available soon. When is the book's premiere, and what will we find inside?

Thanks for asking! The second edition of Practical SQL will be released in February 2022, but the early PDF version is available now from my publisher, No Starch Press. All the code and data are on GitHub as well.

The book’s all about learning SQL, starting with the basics of tables and queries. But along the way, I share a lot of insights about data analysis – things that I’ve learned to be mindful of during my career. Checking data for accuracy and completeness, for example. I wanted the book to be true to real life, so we use several actual data sets for the exercises.

For the second edition, I added new material to just about every chapter – new techniques, additional examples, additional syntax and functions, more data sets. I also spend more time helping readers set up their coding environment, which is especially helpful for people who are new to coding.

I had some great help and mentorship from Stephen Frost of Crunchy Data, who was my technical reviewer for this edition. He brought a lot of useful, practical insights and helped me avoid dumb mistakes.

Who should read your book?
Practical SQL is appropriate for beginners who’ve never done any programming. The language is very easy to grasp. But there’s also plenty of material that will challenge people with some SQL experience! The book covers working with GIS, storing and querying JSON data, and full-text search. These are typically more advanced topics.

I’d also say that this is a good book for front-end developers or data scientists who need to build their back-end data skills. While writing the book, I often thought about people who need to work with data and find that a spreadsheet just isn’t robust enough to get the job done. You can do a lot with a spreadsheet; still, it really can’t handle big data sets. But SQL can!

Most SQL books are pretty boring, like a different form of database documentation with some unrealistic examples. You've made learning SQL syntax interesting. What's your secret?

Thanks for saying that! I write from the perspective of a teacher in a classroom. How would I keep a group of students engaged each week for a whole semester? To me, the answer is to have the book focus on real-life data analysis, with the joy of discovery and sometimes the pain of messy data getting in the way. So, I use real data and try to help readers discover insights about the data while they’re learning. And, as I’m sure you know, data can sometimes be lacking in terms of quality. We learn about that too, and how to deal with it. That’s very real.

At the end of each chapter, you give your readers a project to complete. Do you think interactive SQL courses would complement your book nicely?

I’m a big fan of learning from several angles and finding the best way you like to learn, so it’s possible. I’ve thought about developing courses based on my book, but I don’t have any immediate plans.

Along with other authors on the LearnSQL.com blog, I’ve been trying to convince our readers for a long time that data analysis is more than just an Excel spreadsheet.
What arguments do you use to convince people to use SQL?

I say that spreadsheets are great, but anyone who works with data long enough will soon discover that they have some big limitations. One of them is how easy it is to create errors in data without knowing it. For example, if you double-click a CSV file and open it with Excel, the spreadsheet will automatically assume that a U.S. postal code like 04401 is a number and drop the leading zero! Or it might try to turn a product code of 3-2 into a date. This is so common that it’s one of the topics I cover in the book – how to clean up those kinds of messes.

With SQL, you get to enforce rules about data. A column of numbers cannot hold text. Dates must be formatted a certain way, etc. SQL gives you the ability to make your data less prone to inconsistency.

Spreadsheets also have practical limits. You won’t be able to work in Excel with a data set that has 200 million rows. Even R or Python can struggle at times if you try to load that much data at once into a dataframe. SQL databases are built to handle big data sets and to perform reasonably quickly, especially when you add indexes.

Then there’s the relational aspect of SQL databases: the ability to create table relationships that reduce redundancy and make data management much easier. I cover this quite a bit in the book – not only in terms of how to implement relationships between tables, but also the advantages of relational data. There’s more I could say, but those are good starters.

You have chosen PostgreSQL. Why?

Well, my first concern is that I don’t want readers to have to pay for software or run into limitations if they use the free version of a paid product. That leaves a few choices for free, open-source SQL databases – mainly MySQL, SQLite, and PostgreSQL. All are great and have strong user communities. I’ve used each of them, but over the years I’ve been most attracted to PostgreSQL.
That started about 10 years ago, when I was studying how to build websites with Django. PostgreSQL is very well integrated with that framework. I also like the open-source PostgreSQL GUI, pgAdmin. Ultimately, though, if I had to name one thing about PostgreSQL that makes it my first choice, it would be the PostGIS spatial data extension. It’s an incredibly powerful and useful tool that integrates well with QGIS and other GIS tools. There’s some great Python support for PostgreSQL as well.

People often talk about being data-driven and about data democratization. Such terms have become very popular lately. Is data that important?

I think data is as important as creativity and intuition and study and good common sense in terms of understanding just about any topic. Someone who wants to really know the world around them – or understand economics or demographics or any of a host of topics – has to have a handle on what the data says. It’s one thing to hear what people are talking about and how they say they feel, and that’s important, but it’s also important to measure activity and quantify what’s actually happening, what people actually do, or how systems actually behave based on measurements. The ability to understand data, to get answers from data, is very important.

How can SQL help people grow their business?

Well, similar to what I said earlier, SQL is one tool for helping a business manage its data in a coherent and efficient way. It also helps them gain insights about their business activity that might not be apparent on the surface.

You work for the Wall Street Journal, where you write about the economy, trade, demographics, and the Covid-19 pandemic. Do you use data analysis when preparing your articles?

I am a data editor, so yes! You can see examples of stories I’ve worked on, on my Wall Street Journal page. I also worked as a journalist for many years, both in newspapers and on television.

What is it about data that attracts people like us?
I think it may be that no two data sets are the same. There’s always something new to learn, some insight that, when you run a query, makes you go, “Aha!” My point is that data is never boring, contrary to what some might think. Data, at the end of the day, is really about people and how people behave. And people are very interesting.

You use the term 'interviewing data'. Do you think of the database as an individual who has an interesting story to tell you?

Absolutely, yes. Data has a story that plays out on multiple levels, like a good novel. There’s the origin story – Where did the data come from? Who assembled it? There’s the ongoing narrative that the data has to tell: What trends does it reveal? Then, like most good characters in a novel, the data may have some flaws or limits. Those are important to understand so we know how we can use the data. So, we have to look at data from many angles and ask questions to get to know it well, the same way we’d get to know a neighbor.

What is the future of SQL and databases?

I can’t say I know exactly, but I will say that SQL has endured for more than 40 years and will likely be around for some time to come. Ten years ago, I heard a lot of people saying that NoSQL databases would make SQL obsolete, and that hasn’t happened. Then I heard people saying that R and Python/pandas would make SQL obsolete. That also hasn’t happened. If anything, products like PostgreSQL have evolved to compete well with NoSQL solutions, and many R and pandas users have discovered how useful it is to integrate SQL databases into their workflow.

So, I think it’s safe to say that SQL will be a valuable skill for some time to come. I don’t know that the language itself will evolve dramatically, but I imagine that the engineers who develop SQL databases will continue to find ways to optimize performance to handle the increasingly large data sets that our data-centric world is producing.

What books have you read lately?
Can you recommend any books about SQL or other fields?

I just picked up PostGIS in Action by Regina O. Obe and Leo S. Hsu. I’m enjoying it quite a bit so far. A book from No Starch, Python One-Liners by Christian Mayer, has been on my shelf and waiting to be read for a few months. And I’m also interested in picking up SQL for Data Scientists by Renee M. P. Teate.

Thank you for the interview. This has been a lot of fun!

Thank you for having me!

To Learn More About SQL

Well, now you know what to do to develop your SQL skills. If you are a complete beginner and Anthony’s interview has convinced you to use PostgreSQL, start with the SQL Basics in PostgreSQL course, buy Anthony's book, and learn the basics.

Practical SQL: A Beginner's Guide to Storytelling with Data is also a guide to what to learn next. It will be the perfect complement to our SQL from A to Z in PostgreSQL track.

Are you interested in spatial data? Check out our PostGIS course – currently one of the few complete and available interactive courses on this cool PostgreSQL extension.

Most importantly, keep learning and improving your SQL skills! Read, do online exercises, write queries, and repeat! Keep doing this until you become an expert and achieve your goals!