Back to articles list Articles Cookbook
12 minutes read

A Complete Guide to Working With Substrings in SQL

In this article, we’ll discuss how to work with substrings in SQL. It involves learning several approaches, so we’ll show you their ins and outs.

To work with substrings in SQL, we first need to know what a string is. In programming, a string is any kind of textual data (a title, a description, a name). A string consists of any number and type of characters. In a SQL database, strings are typically stored as CHAR or VARCHAR data types.

A substring is part of a string. In other words, a substring is a smaller part of the sequence of characters. If you had the string 'SQL Course', ‘SQL’ would be a substring for that string. We’ve illustrated this in the image below:

Working With Substrings in SQL

Notice that the image (and a string) doesn’t ignore blank space; blank spaces are also characters in a string.

Get Ready to Work with SQL Substrings

Knowing the SQL text functions family is mandatory when working with strings. While we’ll keep the substring examples as simple as possible, we will need text functions.  Using these functions often requires knowing other SQL concepts, such as JOINs, grouping data, filtering data, and using subqueries. You can practice all these concepts in our SQL Practice track. There are eight practice courses with a total of 950+ coding challenges, so your fingers will hurt by the end of this course :)  But that’s kind of the point, as practice is really the only way to become fluent in SQL.

We’ll use the same approach here and show you as many examples as this article permits. When extracting a substring from a string in SQL, the conspicuously-named text function SUBSTRING() seems an obvious choice. However, it’s not the only choice! There are also others, such as the LIKE operator and the LEFT() and RIGHT() text functions.

While this may seem easy, working with SQL substrings is not always straightforward. You have to know the nuances of each function you want to apply, how indexing in SQL works, and how SQL handles textual data. Many data professionals will confirm that working with substrings in SQL can become a painful and frustrating experience!

Well, we’ll make sure it isn’t painful for you. Let’s get started. 

Example Dataset

We’ll work with the table clients. As the name suggests, it’s a list of the company’s clients.

idfirst_namelast_namedate_of_birthemail
1Mr. IsaacGuardiola19.08.1994iguardiola@gmail.com
2Mx. CarryJones19.06.1982cjones@yahoo.com
3Mr. FrankThomas01.01.1994fthomas@yahoo.com
4Ms. MarianneMeijer27.11.1989mmeijer@meijer.com
5Mr. DaveTrotter15.04.1957dtrotter@aol.com

You can create the table by using this query. Note that the date of birth is stored in a text data format.

Let’s begin by extracting substrings!

Check If a String Contains a Substring in SQL

A common way to check if a text contains a certain substring in SQL is by using the LIKE or ILIKE operator. They are used with the WHERE clause to filter data. Both operators will find strings with a substring like the one defined in their pattern. The only difference is that LIKE is case sensitive, while ILIKE isn’t.

Use LIKE and ILIKE to Check If Text Contains a Substring

Your colleague tells you that a client called with a message for you. As the colleague was in a hurry, they forgot to take down the client’s name. The client’s name was Isaac, or Dave, or something like that.

You could try to track down the client named Isaac or Dave by writing this query:

SELECT *
FROM clients
WHERE first_name LIKE 'Dave' OR first_name LIKE 'Isaac';

In this query, you select all the columns from the table. Then you use the WHERE clause and the LIKE operator. You put the names ‘Dave’ and ‘Isaac’ in single quotes in the LIKE operator.  The condition in the WHERE clause can be read as: “Return all the clients with Dave or Isaac as a first name”.

Let’s run the code and see…that there’s absolutely nothing! The code returns no rows. Stupid code! If we look at the table, we see there are clients named Isaac Guardiola and Dave Trotter:

idfirst_namelast_namedate_of_birthemail
1Mr. IsaacGuardiola19.08.1994iguardiola@gmail.com
5Mr. DaveTrotter15.04.1957dtrotter@aol.com

So why doesn't it work? The above query will assume the substring begins with 'Dave' or 'Isaac'. By looking at the table, we can see that the string actually begins with their salutation.

To make that query work, we should use the wildcard '%' with LIKE. This wildcard is used for searching a substring at the start, at the end, and anywhere in between. It all depends on where you position the wildcard. Here’s a short overview of how to use it.

Wildcard & LIKEExplanation
LIKE 'Mr.%'Finds values that start with 'Mr.'
LIKE '%Mr.'Finds values that end with 'Mr.'
LIKE '%Mr.%'Finds values that have 'Mr.' anywhere in the string.

Let’s rewrite the query:

SELECT *
FROM clients
WHERE first_name LIKE '%Dave%' OR first_name LIKE '%Isaac%';

We learned that the wildcards before and after the substring we want to find mean the query will look for this substring anywhere in the string. This lets us bypass the salutation in front of each name.

Here’s the output:

idfirst_namelast_namedate_of_birthemail
1Mr. IsaacGuardiola19.08.1994iguardiola@gmail.com
5Mr. DaveTrotter15.04.1957dtrotter@aol.com

To give you another example of using '%' with LIKE, you can also look for all the misters in our client list.

You do that by placing the wildcard after the substring you’re searching, as the substring starts with 'Mr.':

SELECT *
FROM clients
WHERE first_name LIKE 'Mr.%';

Here’s the output.

idfirst_namelast_namedate_of_birthemail
1Mr. IsaacGuardiola19.08.1994iguardiola@gmail.com
3Mr. FrankThomas01.01.1994fthomas@yahoo.com
5Mr. DaveTrotter15.04.1957dtrotter@aol.com

You probably noticed that we write the substrings in LIKE exactly as they are written in the table regarding capitalization. It’s because the LIKE operator is case sensitive.

If you want to avoid that or are unsure how the data is formatted, you can use the ILIKE operator. As it is case insensitive, you can write the substring any way you want:

SELECT *
FROM clients
WHERE first_name ILIKE '%dave%' 
OR first_name ILIKE '%iSAaC%';

The output is the same as with LIKE:

idfirst_namelast_namedate_of_birthemail
1Mr. IsaacGuardiola19.08.1994iguardiola@gmail.com
5Mr. DaveTrotter15.04.1957dtrotter@aol.com

You can learn more about LIKE (and NOT LIKE) in this article.

The SUBSTRING() Function

SUBSTRING() is an SQL function used for extracting substring from a string. It allows you to specify the string from which you want to extract the substring. You define the substring by specifying its starting position within the string and its length.

Here’s the function's syntax:

SUBSTRING(string, start_position, length)

Like any SQL function, SUBSTRING() takes arguments in its parentheses. Two arguments are mandatory:

  • string – The string you want to search; it can be a constant or a column/expression.
  • start_position – An integer value defining the position (the ordinal number of the character) at which the substring starts, i.e. the position of the substring’s first character.

The third argument is length, which is optional. It defines the length of a substring as the number of characters it will contain. When this argument is omitted, the substring will be the remainder of the string, counting from the start position.

Let’s see some examples to clarify all this.

Use SUBSTRING() to Extract a Substring from a Constant String

A constant string is any string you write in single quotes as the first argument in SUBSTRING().

For example, the constant string in the following code is 'SQL Practice Course', and we want to extract 'Practice Course'.

SELECT SUBSTRING ('SQL Practice Course', 5) 
AS substring_extract

The integer 5 is the start_position argument. It means the substring starts from the string’s fifth character.

Working With Substrings in SQL

Why choose 5 if there are only three letters ('SQL') in front of our substring? Shouldn’t the substring start at the fourth position? Be careful: The blank space also counts as a character in a string!

As we omitted the length argument, the code will return the rest of the string as a substring. In other words, the code returns this result:

substring_extract
Practice Course

Use SUBSTRING() to Extract a Substring from a Column

Instead of a constant value, you can also specify a column from which you want to extract a substring.

Let’s show the client’s first and last name but without the salutation (Mr./Ms./Mx.):

SELECT SUBSTRING (first_name, 5) AS first_name,
	 last_name
FROM clients;

We again use the SUBSTRING() function. The first argument is the column’s name, which is first_name. The second argument is the starting position of the substring, which is five. Remember to count the blank space! There’s one after the salutation, so 'Mr. ' consists of four characters. The fifth character is the first letter of the client’s name.

We omitted the substring length, and the second selected column is the client’s last name, so the code returns this:

first_namelast_name
IsaacGuardiola
CarryJones
FrankThomas
MarianneMeijer
DaveTrotter

Use SUBSTRING() to Extract a Substring of a Defined Length

We’ll now stop avoiding the third SUBSTRING() argument. In this example, we’ll show how to extract the clients’ day and month of birth:

SELECT first_name,
	 last_name,
	 SUBSTRING (date_of_birth, 1, 6) AS birthday
FROM clients;

After selecting the clients’ first and last names, we use the SUBSTRING() function. We’ll use the column date_of_birth to extract the client’s birthday (month and day).

After passing the column as the argument, we specify the starting position. This is number 1, which means the substring will be extracted from the string’s first character. The number 6 means the substring will consist of six characters: two characters for a day, a dot, two characters for a month, and another dot.

Take a look at the result:

first_namelast_namebirthday
Mr. IsaacGuardiola19.08.
Mx. CarryJones19.06.
Mr. FrankThomas01.01.
Ms. MarianneMeijer27.11.
Mr. DaveTrotter15.04.

All the clients are there, along with their birthdays extracted the way we wanted. We don’t mind showing the salutation this time.

Using SUBSTRING() with Other Functions to Find a Substring Index

There are functions that can be used for finding an index within the string. In MySQL and PostgreSQL, this function is called POSITION(); in SQL Server it’s called CHARINDEX(). You can specify the substring you’re looking for and these functions will return its position within the string.

These functions are extremely helpful in combination with SUBSTRING(). Let’s discuss POSITION() first, then we’ll see an example with CHARINDEX().

The POSITION() function allows you to specify the substring and it will find its starting position. Let’s use it to extract each client’s username from their email address:

SELECT first_name,
	 last_name,
	 email,
	 POSITION('@' IN email) AS at_position,
	 SUBSTRING(email, 1, POSITION('@' IN email)-1) AS username
FROM clients;

Above, we use POSITION() to find the '@' in each user’s email address. You need that because everything that comes before that character is the user’s name. The syntax for POSITION() is simple: just specify the search character in single quotes, then follow it with the keyword IN and the name of the columns where you want to look.

This column only returns the position of the character. We need to embed it in SUBSTRING() to get the username. The first two arguments in SUBSTRING() are familiar: we’re looking at the column email, and we want our substring (a username) to start with the first character of the string.

The third argument in SUBSTRING() is the length of the substring. The length of the username is the number of characters before and excluding '@'. In other words, the length of the username is the position of '@' minus one. Why minus? Because otherwise, '@' would be part of the username.

first_namelast_nameemailat_positionusername
Mr. IsaacGuardiolaiguardiola@gmail.com11iguardiola
Mx. CarryJonescjones@yahoo.com7cjones
Mr. FrankThomasfthomas@yahoo.com8fthomas
Ms. MarianneMeijermmeijer@meijer.com8mmeijer
Mr. DaveTrotterdtrotter@aol.com9dtrotter

We can check the result. In the 'iguardiola@gmail.com', the '@' symbol is at position 11, as shown in the column at_position. The column username shows the substring before '@', which is indeed 'iguardiola' for this client. You can check the rest of the results the same way.

The same code in SQL Server should use the function CHARINDEX():

SELECT first_name,
	 last_name,
	 email,
	 CHARINDEX('@', email) AS at_position,
	 SUBSTRING(email, 1, CHARINDEX('@', email)-1) AS username
FROM clients;

The logic is similar to the POSITION() function: you specify the substring you’re looking for and then the column, with the arguments separated by a comma.

The code output is the same.

We’re now finished with SUBSTRING(). But if you need more practice, here are five more SUBSTRING() examples.

LEFT() and RIGHT() in SQL

Another way to extract a substring is to use the LEFT() or RIGHT() functions.

LEFT() extracts the substring from the left, allowing us to define the length of the substring. The RIGHT() function does the same thing but from the right side.

Use LEFT() And RIGHT() to Extract a Substring

Let’s use these two functions to extract some substrings: the first three letters of a last name and the year of birth.

Here’s the code.

SELECT first_name,
	 last_name,
	 LEFT(last_name, 3) AS last_name_substring,
	 RIGHT(date_of_birth, 4) AS year_of_birth
FROM clients;

We use the LEFT() function to get the first three letters of the last name. It starts from the left and takes the specified number of characters to create a substring. The syntax is simple: we specify the column name and then the length of the substring.

RIGHT() is used to find the year of birth.  It starts from the right and takes the specified number of characters to create a substring. Here, the first argument is the date_of_birth column. Then we count the number of characters we want from the right, which is four – this will return the birth year.

As LEFT() and RIGHT() are basically mirror images of each other, we could have easily used either of these functions to find both required substrings.

Here’s the output of the above query:

first_namelast_namelast_name_substringyear_of_birth
Mr. IsaacGuardiolaGua1994
Mx. CarryJonesJon1982
Mr. FrankThomasTho1994
Ms. MarianneMeijerMei1989
Mr. DaveTrotterTro1957

It shows the first three letters of the last name and the year of birth.

Become the One That Pulls All the (Sub)Strings in SQL

As a data analyst, you’re expected to pull all the (sub)strings by yourself. One of the tricks to achieve that is to know various ways of working with substrings, like the ones we showed you here.

Knowing different functions for handling substrings allows you to choose the one that suits your problem best. Not only will it get you the result you want – you’ll also end up with more efficient code and fewer lines.

To get to that level, you must work with strings a lot and use all the mentioned functions daily. In other words, you need to write code. If you need a safe environment to do that, our SQL Practice course offers exactly that.

Good luck with becoming the master of SQL substrings!