Back to articles list October 20, 2016 - 7 minutes read How to Tackle SQL NULLs: COALESCE function Patrycja Dybka Junior Java Developer @ e-point Tags: null null-values It's inevitable that some data in the database has no value what in SQL is represented by the NULL keyword. "No value" here is different from zero, false, or an empty string (but with exceptions! In Oracle database, NULL is the same as string of zero length). During playing with the database, treating NULLs are becoming more and more troublesome, therefore >MySQL standard provides help with some of the functions, like COALESCE. Let's introduce the COALESCE() function: Definition COALESCE function returns a first non NULL argument from the passed list of arguments. Its syntax is as follows: COALESCE(x, y, … , n) What additionally is important, that: It takes at least two arguments. The expressions in the COALESCE function must evaluate to the same data type (For example the SQL statement SELECT COALESCE (1, 'aa', NULL, 23); will produce an error). If all listed arguments evaluate to NULL, the function also returns NULL. COALESCE is essentially a shorthanded CASE statement presented below: CASE WHEN x IS NOT NULL THEN x WHEN y IS NOT NULL THEN y WHEN ... IS NOT NULL THEN … ELSE n END COALESCE – usage examples Now, take a look at the presented below prototype of the university database model. I will show some examples that requires the COALESCE function that refers to tables from this model. Using COALESCE with value concatenation Take a look at the student table from the introduced university database model: It may happen, that the particular person could not have the the middle name, therefore some records in the middle_name column may be NULLs like show the example data below. Let's try to concatenate student's first name, middle name, last name. This one works in PostgreSQL, Oracle. SELECT first_name || ' ' || middle_name || ' ' || last_name AS full_name FROM student; The result table contains a column full_name with the concatenated strings. Notice that only one record has a proper full name. The rest of them are NULLs, because the middle name was also NULL (while concatenating strings with NULL value, the result is also NULL). Such a result doesn't looks very nice. The result don't show the full name of students with no middle name. Here the COALESCE() function comes with the rescue. Using this we can replace NULL fields with for example an empty string. Now, the new one query looks as follows: SELECT first_name || COALESCE(' ' || middle_name || ' ', ' ') || last_name AS full_name FROM student; The result is now complete 🙂 MySQL, MS SQL Server uses for the concatenation CONCAT() function or ‘+‘ operator. Similarly, the query in MySQL or MS SQL Server will look as follows: SELECT CONCAT(first_name, ' ', middle_name, ' ', last_name) FROM student; … and the records that have NULLs will also have : With the COALESCE function, the NULLs that appear in the middle_name column will be replaced. SELECT CONCAT(first_name, ' ', COALESCE(middle_name, ''), ' ', last_name) FROM student; Using PostgreSQL COALESCE with aggregate functions Well, NULLs indeed can appear very problematic. Let's imagine different situation. I want to get to know what's the average grade for the course. It seems very easy, right? But, wait, wait… even if this is a simple query we can encounter some drawbacks. The issue concerns some of the SQL aggregate functions like SUM(), MAX() or MIN(). EXAMPLE: Using COALESCE with AVG function Let's try an example with student_courses table. As you see below, students who are registered to the course of id 1 don't have a grade yet. The query, that returns the average grade for the course looks as follows: SELECT course_instance_id, AVG(grade) AS average_grade_in_course FROM student_courses GROUP BY course_instance_id; For the AVG() function , the non-NULL fields are summed and the sum is divided by the number of non-NULL fields. So, if all the records have NULL in grade column, the average grade will be NULL as well. The result table below: In such a case, we want to insert other value, like 0 (the grade can be a value from 2 to 5, so the average 0 can indicate for us that students don't have grades) COALESCE() function comes with help in replacing a values. The same query with COALESCE will be: SELECT course_instance_id, COALESCE(AVG(grade), 0) AS average_grade_in_course FROM student_courses GROUP BY course_instance_id; And now we can get the result table as: EXAMPLE: Using COALESCE with SUM() function This one is a similar example. Let's assume different situation. We want to count the total amount of the absence hours for each student. Take a look at selected part of the model: The data in the table student_course_attendance looks as follows. While the data student_courses is: Take a look at the data in those tables closely. As you see, two students are registered for the particular course, while only the one was missing classes. The query, that will calculate the sum of the total number of the absence hours for a course_id = 1 will for each student is: SELECT student_id, SUM(absence_hours) AS total_absence_hours FROM student_courses sc LEFT JOIN student_course_attendance sca ON sc.id = sca.student_courses_id WHERE course_instance_id = 1 GROUP BY 1; … and evaluates in a result table: The total absence hour was evaluated to NULL, because indeed, this student didn't miss any classes in this course instance. There weren't any record for this student in the student_course_attendance table, and the SUM() function returned NULL. We can secure from this situation of course using COALESCE function. The new query is as follows: SELECT student_id, COALESCE(SUM(absence_hours), 0) AS total_absence_hours FROM student_courses sc LEFT JOIN student_course_attendance sca ON sc.id = sca.student_courses_id WHERE course_instance_id = 1 GROUP BY 1; And the result is now: Let's abandon now the university model and look at the different example. Using COALESCE in creating PIVOT tables This use case will be presented for the PostgreSQL database. Note, that other databases like Oracle and MS SQL Server have their equivalent functions for creating pivot tables. Look at the table below. This is the result of some query, that shows what is the total sale for each brand in the particular month. I want to transponse the rows with columns (create a pivot table). The result table should have columns: brand_id, jan, feb, …, dec. Notice that not in the every month product from the particular brand was sold. Let's take a look at the Postgres example. SELECT brand_id, jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec FROM CROSSTAB ( 'SELECT year, month, qty FROM sales ORDER BY 1', 'SELECT m FROM generate_series(1,12) m' ) AS ( year int, "jan" int, "feb" int, "mar" int, "apr" int, "may" int, "jun" int, "jul" int, "aug" int, "sep" int, "oct" int, "nov" int, "dec" int ); After executing such a query, we receive: Unfortunately, the sell is recorded only in some of the months and only in those fields, the number of total sales is pasted. As you probably expected, the rest of the fields contains NULLs. To get rid of the unwanted NULLs we can use the COALESCE function, that when needed insert a 0 instead of NULL. The repaired query looks as follows: SELECT brand_id, COALESCE(jan, 0), COALESCE(feb, 0), COALESCE(mar, 0), COALESCE(apr, 0), COALESCE(may, 0), COALESCE(jun, 0), COALESCE(jul, 0), COALESCE(aug, 0), COALESCE(sep, 0), COALESCE(oct, 0), COALESCE(nov, 0), COALESCE(dec, 0) FROM CROSSTAB( 'SELECT year, month, qty FROM sales ORDER BY 1', 'SELECT m FROM generate_series(1,12) m' ) AS ( year int, "jan" int, "feb" int, "mar" int, "apr" int, "may" int, "jun" int, "jul" int, "aug" int, "sep" int, "oct" int, "nov" int, "dec" int ); Summary NULLs can make life problematic. If you haven't experienced the bad side of the missing values yet, you will come across it for sure. It's a matter of time. For now, remember the COALESCE function, that will help you tackle the unwanted NULLs. Tags: null null-values You may also like Useful SQL Patterns: Matching Nulls by Masking Nulls Today, in the first post of the SQL patterns series, we will consider the match by null pattern. It’ll help you deal with tables containing null values. Read more Understanding the Use of NULL in SQL Three-Valued Logic NULLs are necessary in databases, learning to use them is fundamental to SQL success. However, NULLs should be handled with care – see how! Read more 18 Useful Important SQL Functions to Learn ASAP Almost every company needs someone with knowledge of SQL. Structured Query Language is commonly used for business intelligence (BI), analysis, computation. Read more The SQL Coalesce Function: Handling Null Values Though the SQL coalesce function may seem complex, it’s actually very straightforward. Let’s look at how it can be used to work with NULL values. Read more Subscribe to our newsletter Join our weekly newsletter to be notified about the latest posts.