7th Jul 2022 12 minutes read An Overview of SQL Text Functions in MySQL Tihomir Babic SQL text functions MySQL Table of Contents What Is a Function, Anyway? MySQL Text Function Examples Sample Data CONCAT() CHAR_LENGTH() and LENGTH() UPPER() & LOWER() SUBSTRING(), RIGHT(), and LEFT() REPLACE() TRIM() REVERSE() LOCATE() Time to Practice MySQL Text Functions! Do you work in MySQL as a data analyst? Then you should know how to use its text functions – data analysts don’t work only with numbers. Reporting requires computations, data classification, and label creation, and you can do all of that in MySQL. In this article, I’ll explain several of the most common and useful MySQL text functions by discussing what they do and showing how they do it. In a way, this will be a condensed version of our Common Functions in MySQL course. It’s a great resource for intermediate users who want to become proficient in MySQL functions. The course uses MySQL 8, but anyone using MySQL 5 and above can take it; the functions available only in MySQL 8 are clearly marked. The course covers four topics: Text functions. Numeric functions. Date and time functions in MySQL. NULL functions. Now, let’s get started by defining MySQL functions. What Is a Function, Anyway? Function this, function that. But what is a function in SQL? It’s a named and predefined piece of code that you can use in your query. How does this translate to SQL syntax? MYSQL_FUNCTION(argument1, argument 2, …) The name of this example function is MYSQL_FUNCTION. The arguments (enclosed in parentheses) tell the function what to do. MySQL Text Function Examples MySQL’s built-in text functions are important; they let you write queries effectively when you’re dealing with text (e.g. string, varchar) values. Some of these are standard SQL functions, so you can use the Standard SQL Functions Cheat Sheet to follow the article better. First, let’s look at the sample data. Sample Data Below is a table for a car dealer network in France. It contains basic info about showroom cars at every dealer. The table is named showroom_cars and it contains these columns: id – The customer’s ID. make – The car’s manufacturer. model – The car model. serial_number – The car’s serial number. made_in – The country where the car was manufactured. dealer – The name of the dealer that has the car. Here’s a look at the data: idmakemodelserial_numbermade_indealer 1RenaultClio2MRKL32020t894509###France###Fred Cars, Nantes 2FordFocus5irpo52022D568498###Germany###Auto Jacques, Bordeaux 3FordFiesta7kOhj71999T456055###Spain###Fred Cars, Nantes 4RenaultTwingo9weŽT92018t489337###Slovenia###Paris Cars, Paris 5MazdaCX-36PPRO42018M774848###Japan###Top Car, Marseille Now, let’s start exploring the MySQL text functions. CONCAT() CONCAT() combines the arguments into one string. When you have several separate texts, this is the function to use if you want to turn them into one text. A car’s make and model are ideal for this function. You want to show them in one column, with the make and model separated by space. This is how you’d do it: SELECT id, CONCAT(make, ' ', model) AS car_name FROM showroom_cars; The first argument in CONCAT() is the column make. The space is also a character, so the query states it explicitly as the second argument in the single quotation marks. The third argument is the car model. That’s how you get a list of all the showroom cars. idcar_name 1Renault Clio 2Ford Focus 3Ford Fiesta 4Renault Twingo 5Mazda CX-3 CHAR_LENGTH() and LENGTH() Both functions return the length of the string that is the function’s argument. However, they are not the same functions. CHAR_LENGTH() returns the string length as the number of characters; LENGTH() returns the length in bytes. A car’s serial number has to be 17 characters long. If it’s not, there’s some mistake in the data. Let’s use both CHAR_LENGTH() and LENGTH() to see if they return different results. SELECT id, serial_number, CHAR_LENGTH(serial_number) AS serial_number_length_char, LENGTH(serial_number) AS serial_number_length_bytes FROM showroom_cars; The query selects the car ID and the serial number. Then it checks the serial number’s length. The argument in the CHAR_LENGTH() and LENGTH() functions is the column serial_number. The query will return this: idserial_numberserial_number_length_charserial_number_length_bytes 12MRKL32020t8945091717 25irpo52022D5684981717 37kOhj71999T4560551717 49weŽT92018t4893371718 56PPRO42018M7748481717 All the serial numbers are OK because the column serial_number_length_char shows they are all 17 characters long. Is the result the same when it comes to length in bytes? It seems it is…oh, no, it’s not! Look at row four! It says the serial number is 18 bytes long, instead of 17, like all the other serial numbers. But the previous column clearly shows this serial number consists of 17 characters. What went wrong? Nothing. If you inspect this serial number a little closer, you’ll see the fourth character is the letter ‘Ž’. This letter is a Unicode character, which is usually encoded in two bytes. Thus, 16 characters in this serial number are taking up one byte each and the Unicode character takes two bytes. It’s correct to say the serial number’s length is 17 characters but 18 bytes. UPPER() & LOWER() The UPPER() function takes the text and turns all the letters into uppercase. LOWER() does the opposite: all the letters will become lowercase. A closer inspection of data in the column serial_number shows that letters are sometimes the upper case and sometimes the lower case. It shouldn’t be this way! The people in charge still don’t know which way is correct, but they are sure that all the letters must be either uppercase or lowercase. They asked you to provide them with both versions. You need the UPPER() and LOWER() functions to give them what they want: SELECT id, make, model, serial_number, UPPER(serial_number) AS serial_number_uppercase, LOWER(serial_number) AS serial_number_lowercase FROM showroom_cars; Here we select the columns id, make, model, and serial_number. Then we use the column serial_number as an argument in both the UPPER() and LOWER() functions. The serial numbers are now formatted in the desired way: idmakemodelserial_numberserial_number_uppercaseserial_number_lowercase 1RenaultClio2MRKL32020t8945092MRKL32020T8945092mrkl32020t894509 2FordFocus5irpo52022D5684985IRPO52022D5684985irpo52022d568498 3FordFiesta7kOhj71999T4560557KOHJ71999T4560557kohj71999t456055 4RenaultTwingo9werT92018t4893379WERT92018T4893379wert92018t489337 5MazdaCX-36PPRO42018M7748486PPRO42018M7748486ppro42018m774848 SUBSTRING(), RIGHT(), and LEFT() The SUBSTRING() function will return a substring of a string. In other words, it returns part of the text that is the function’s argument. The basic syntax is to define the argument and then the position of the substring, i.e. the character from which the string begins. The position is an integer, and counting starts from the left. MySQL gives you some additional possibilities: You can define the start of the substring and its length in the number of characters You can use a negative integer as an argument to give the position of a substring from the right. The RIGHT() function is a type of a SUBSTRING() function, but it works only from the right. In other words, it returns the defined number of the characters counting from the right. You can define the length of the output. LEFT() is the opposite: it returns the number of characters from the left. For a SUBSTRING() example, imagine that you have to extract a car’s year of manufacture from its serial number. Oh, yeah, it’s possible! The four numbers starting from the seventh character are the manufacture year. Just to make things a bit trickier, you also have to show the serial number's first and last six characters. The following query uses SUBSTRING() in two ways: once counting characters from the left and once from the right. Then it uses the LEFT() and RIGHT() functions to get the six beginning and ending characters: SELECT id, make, model, serial_number, SUBSTRING(serial_number, 7, 4) AS year_produced_left, SUBSTRING(serial_number, -11, 4) AS year_produced_right, LEFT(serial_number, 6) AS serial_number_six_left, RIGHT(serial_number, 6) AS serial_number_six_right FROM showroom_cars; The first SUBSTRING() in the above query counts from the left. The first argument is the column serial_number; this is the string you’ll extract the substring from. The first integer defines the start of the substring. The manufacture year takes up the seventh to eleventh characters; therefore, seven is the first integer. The second integer, four, is the length of the substring; the manufacture year consists of four characters. The second SUBSTRING() does the same but from the right. The first argument is the same. The second one is the integer -11 because the manufacture year starts from the eleventh character if counted from the right. The length of the year is again four characters. In the LEFT() function there is, again, the serial number. The integer six means the output will be six characters from the left. The arguments are the same in the RIGHT() function. idmakemodelserial_numberyear_produced_leftyear_produced_rightserial_number_six_leftserial_number_six_right 1RenaultClio2MRKL32020t894509202020202MRKL3894509 2FordFocus5irpo52022D568498202220225irpo5568498 3FordFiesta7kOhj71999T456055199919997kOhj7456055 4RenaultTwingo9weŽT92018t489337201820189weŽT9489337 5MazdaCX-36PPRO42018M774848201820186PPRO4774848 As you can see, both uses of SUBSTRING() give the same result. REPLACE() The REPLACE() function replaces one or more characters in the text with other character(s). Therefore, the function allows you to define the string, the character that is to be replaced, and the character to be replaced with. In the example data, the dealer's name and location are separated by a comma. You need to remove the comma because it’s there by mistake; the location is part of the dealer name. SELECT dealer, REPLACE(dealer, ',', '') AS dealer_formatted FROM showroom_cars; Here is the result: dealerdealer_formatted Fred Cars, NantesFred Cars Nantes Auto Jacques, BordeauxAuto Jacques Bordeaux Fred Cars, NantesFred Cars Nantes Paris Cars, ParisParis Cars Paris Top Car, MarseilleTop Car Marseille The first column is the current string in the dealer column, shown as it was originally entered. The second column shows the corrected version. The formatting is done using REPLACE(). The first argument is the column dealer. The second is a comma in the single quote; this is the replaced character. It is replaced by nothing, with nothing written in the single quote. There really are no commas anymore. TRIM() This is a function that returns the string without the specified leading or trailing characters (or both). How can you use TRIM()? In our example, the manufacturing country in the example data is not shown clean. There are some characters that should be removed. I’ll show you all three ways of using the TRIM() function. Here’s the query: SELECT id, make, model, made_in, TRIM(BOTH '#' FROM made_in) AS country_formatted, TRIM(LEADING '#' FROM made_in) AS country_remove_leading, TRIM(TRAILING '#' FROM made_in) AS country_remove_trailing FROM showroom_cars; The first TRIM() function removes both leading and trailing characters. This is indicated by using the keyword BOTH. After this keyword, you define the character that you want to remove. Then the FROM keyword is used to define which string the character will be removed from. The other two TRIM() functions are almost the same. The only difference is that the LEADING keyword means removing the leading characters, i.e., the ones appearing before the string. TRAILING will remove the trailing characters, i.e., the ones after the string. And the results you can see for yourself: idmakemodelmade_incountry_formattedcountry_remove_leadingcountry_remove_trailing 1RenaultClio###France###FranceFrance######France 2FordFocus###Germany###GermanyGermany######Germany 3FordFiesta###Spain###SpainSpain######Spain 4RenaultTwingo###Slovenia###SloveniaSlovenia######Slovenia 5MazdaCX-3###Japan###JapanJapan######Japan REVERSE() The REVERSE() function reverses the string so that the first character becomes last and vice versa. In other words, it shows the text backwards. An inspection of our data showed that the serial number was entered incorrectly. How it happened, nobody knows. But there are some indications that the angry data analyst who worked there before you messed with data before leaving the company. The word is, it was revenge for not getting a pay raise for five years. Now you have to reverse the serial numbers again. This is not difficult, because you’re a data analyst and you know the REVERSE() function. SELECT id, make, model, serial_number, REVERSE(serial_number) AS serial_number_corrected FROM showroom_cars; The REVERSE() function is simple to use. All you need to do is specify the text, and it will do the rest. idmakemodelserial_numberserial_number_corrected 1RenaultClio2MRKL32020t894509905498t02023LKRM2 2FordFocus5irpo52022D568498894865D22025opri5 3FordFiesta7kOhj71999T456055550654T99917jhOk7 4RenaultTwingo9weŽT92018t489337733984t81029TŽew9 5MazdaCX-36PPRO42018M774848848477M81024ORPP6 LOCATE() Use this function when you want to find the location of a particular character. It will be returned as an integer, specifying the number of characters from the start of the string. For example, if you want to locate the comma in the column dealer, this is what you’d use: SELECT dealer, LOCATE(',', dealer) AS comma_location FROM showroom_cars; In this example, the first argument in the function is the comma specified in the single quote. The second argument is the column dealer. Here are the commas’ positions: dealercomma_location Fred Cars, Nantes10 Auto Jacques, Bordeaux13 Fred Cars, Nantes10 Paris Cars, Paris11 Top Car, Marseille8 How can this function be more useful? Well, you can use it with other functions, such as CHAR_LENGTH() and RIGHT(). For example, we can use it with the dealer column to show only the dealer’s city: SELECT dealer, LOCATE(',', dealer) AS comma_location, CHAR_LENGTH(dealer) AS dealer_length, RIGHT(dealer, CHAR_LENGTH(dealer)-LOCATE(',', dealer)-1) AS dealer_town FROM showroom_cars; The columns comma_location and dealer_length are there only to make it easier to understand how showing the dealer’s town works. They return the location of the comma (same as in the above example) and the length of the string. Now, the RIGHT() function also takes the string from the column dealer. The second argument should be the output string length. And it is, but it’s comprised of the CHAR_LENGTH() and LOCATE() functions. The string length is the difference between these two functions’ output. If you want to see the dealers’ cities only, you need to output the characters that come after the comma. The thing is, the number of characters before the comma is different for every dealer. That’s why the output string in the RIGHT() function will be the total length of the dealer name minus the number of characters before the comma. This difference is the number of characters that appear after the comma, i.e. the dealer’s city. Why then the minus one in the RIGHT() function? Because after a comma comes the space, so deducting one character accounts for that. dealercomma_locationdealer_lengthdealer_town Fred Cars, Nantes1017Nantes Auto Jacques, Bordeaux1322Bordeaux Fred Cars, Nantes1017Nantes Paris Cars, Paris1117Paris Top Car, Marseille818Marseille The number of characters in every city name should be the difference between the dealer_length and comma_location columns – which use the functions CHAR_LENGTH() and LOCATE(), respectively – minus one. Let’s check it. There are six characters in ‘Nantes’, which the mathematics corroborates: 17-10-1 = 6. The same is true with the other rows; you can check if you want. If these examples have only activated your appetite for the SQL text functions, don’t worry; there are more examples in our article An Overview of SQL Text Functions. If you want to read about MySQL date functions, check out our in-depth-guide. Time to Practice MySQL Text Functions! I tried to make these examples of MySQL text functions as relatable as possible; now you understand how these functions work. However, nothing beats writing code yourself in the Common Functions in MySQL course, so use it to develop what you learned here. If that leads to taking the complete SQL learning path or any other MySQL courses, even better for you! Tags: SQL text functions MySQL