Back to articles list May 29, 2017 - 4 minutes read Useful SQL Patterns: Matching Nulls by Masking Nulls Aldo Zelen Aldo is a data architect with a passion for the cloud. From leading a team of data professionals to coding a data warehouse in the cloud, Aldo has experience with the whole lifecycle of data-intensive projects. Aldo spends his free time forecasting geopolitical events in forecasting tournaments. Tags: masking NULLs matching NULLs SQL patterns As you start coding in SQL, you will use some statements and techniques over and over again. We call these “SQL patterns”. This series will look at the most common SQL patterns and consider how to use them. In database development, SQL developers often find themselves returning to the same SQL statements. Learning about these now, early in your SQL journey, will help you work more efficiently. Today, in the first post of this series, we will consider the match by null SQL pattern related to SQL data matching. Why Match by Null? Often, an SQL developer must join two tables based on the values in a certain column (it’s the so-called SQL data matching). For the join to work, these columns must have matching values. For example, suppose we have a poorly-designed database with two tables, movies and genres. Every movie in the movie has a genre (action, comedy, adventure), but some movies have not yet been classified by genre. These records have NULL values in the genres column. In the genres table, there is a matching ID for NULL. So, the movies table has the following columns: movieid, title, and genres. The genres table has two columns, id and genres. Here’s the data in the movies table: movieid title genres 1 Toy Story (1995) 1 2 Jumanji (1995) 1 3 Grumpier Old Men (1995) 2 4 Waiting to Exhale (1995) 2 5 Father of the Bride Part II (1995) 2 6 Heat (1995) 3 7 Sabrina (1995) 2 8 Tom and Huck (1995) 1 9 Sudden Death (1995) 3 10 GoldenEye (1995) And here’s what you’d see in the genre table: id genres 1 Adventure 2 Comedy 3 Action Unknown Let’s say that you need to get a list of all movie titles with their genres, including any movies with unknown genres. Obviously, it makes sense to join the two tables using the genres column. If you were to do a simple join without using popular SQL patterns… SELECT title, g.genres FROM movies m JOIN genres g ON (m.genres = g.id) … You would get results that leave out any movies with an unknown value in the genre row after you’d match the following columns: title genres Tom and Huck (1995) Adventure Jumanji (1995) Adventure Toy Story (1995) Adventure Sabrina (1995) Comedy Father of the Bride Part II (1995) Comedy Waiting to Exhale (1995) Comedy Grumpier Old Men (1995) Comedy Sudden Death (1995) Action Heat (1995) Action GoldenEye (1995) Hmm. No GoldenEye. What went wrong? The Problem with SQL Data Matching by Null Remember, for a join to work, the values in the joining columns must match. But as regular readers of the LearnSQL.com blog know, NULL does not equal NULL. NULL is not the same as NULL! Why not? Sometimes English speakers will use “null” to mean “zero”. But in SQL, NULL does not equal zero. It does not signify “no value”. It is an unknown value. Since we do not know what one NULL represents, we can never say that it is equal to another NULL. Yet, we still need to match records on the basis of NULL values. How can we do this using SQL data matching? Matching by Null with Masking Nulls To get around the fact that we can’t match NULLs, we use the COALESCE statement. This masks all NULL values with a default value : SELECT title, g.genres FROM movies m JOIN genres g ON (COALESCE(m.genres, 0.0) = COALESCE(G.ID, 0.0)) As a result, you match the following columns including the unknown value: title genres Tom and Huck (1995) Adventure Jumanji (1995) Adventure Toy Story (1995) Adventure Sabrina (1995) Comedy Father of the Bride Part II (1995) Comedy Waiting to Exhale (1995) Comedy Grumpier Old Men (1995) Comedy Sudden Death (1995) Action Heat (1995) Action GoldenEye (1995) Unknown The genres column is a NUMERIC type column, so we have to mask NULL values with a number – in this case, “0”. If the column was of another data type, we would mask it with a default value of that data type. For instance, if we were dealing with a CHAR column, we could use an X. To read more about data types, visit the LearnSQL.com Blog. One More Thing About SQL Data Matching and Nulls Matching nulls is usually a quick fix used to get around data modeling mistakes. It’s better to fix your model rather than to work with null masking statements in join conditions. Still, sometimes this cannot be avoided, as with MERGE statements in complex ETL processing. By the way, there was a mistake in the design of this example. It was done on purpose to demonstrate the match by null pattern. Can you spot it? Can you fix it? Tags: masking NULLs matching NULLs SQL patterns You may also like Useful SQL Patterns: Conditional Summarization with CASE When you are using SQL aggregate functions to create report queries, you'll frequently find yourself using conditional summarization with the CASE operator. Read more You Want to Learn SQL? You've Come to the Right Place! If you want to learn SQL basics or enhance your SQL skills, check out LearnSQL.com for a wide range of SQL courses and tracks. Read more Understanding the Use of NULL in SQL Three-Valued Logic NULLs are necessary in databases, learning to use them is fundamental to SQL success. However, NULLs should be handled with care – see how! Read more How to Tackle SQL NULLs: COALESCE function Let's introduce the COALESCE postgresql function. It's inevitable that some data in the database has no value. Find out what then with MySQL. Read more NULL Values and the GROUP BY Clause We've already covered how to use the GROUP BY clause but how does SQL's GROUP BY clause work when NULL values are involved? Find out! Read more Subscribe to our newsletter Join our weekly newsletter to be notified about the latest posts.