Back to articles list Articles Cookbook
8 minutes read

The SQL Substring Function in 5 Examples

Working with text data in SQL? We explain how to get values from any point in a string.

When you think of working with data in SQL, your first thought is probably a database full of numbers and your SQL code doing very fancy calculations. But text is data, too! It’s very common to find text data in databases. Not only do you have to extract it, but often you also have to manipulate it. The functions that let you do so are called text functions.

They are an essential topic for anyone learning or using SQL. The text functions, along with the numeric and date-and-time functions, are the main focus of our Standard SQL Functions interactive course. In addition, it teaches you how to treat NULL values in databases, aggregate data, and use CASE WHEN. You only need an internet connection and a basic understanding of SQL to do the 211 interactive exercises the course offers.

One of the common text functions the course covers is SUBSTRING(). In this article, we have five real-life business examples that cover the main uses of this function. Some examples may feel complicated if you’re not familiar with the text functions, so make sure you have the Standard SQL Functions cheat sheet or an overview of SQL text functions by your side.

What Is the SUBSTRING() Function?

SUBSTRING() is a text function that allows you to extract characters from a string. Its syntax is

SUBSTRING(expression, start, length)

For the expression argument, you write a string literal or specify a column from which you want to extract the substring. The start argument is an integer indicating the numeric position of the character in the string where the substring begins. The length argument, as the name says, defines the length, an integer value, of the substring to be returned.

How Does SUBSTRING() Work?

The clue is in the function’s name itself. A substring is a string within the main string. Therefore, SUBSTRING() extracts a substring as you specify in its argument.

It works like this:

sql substring function

In the string above, the substring that starts at position 1 and has a length of three characters is ‘STR’.

Now that we have the principles covered, let me show you several examples. Starting, of course, with the simplest one!

Example 1: Substring From a String Literal

The SUBSTRING() function returns a substring from any string you want. You can write the string explicitly as an argument, like this:

SELECT SUBSTRING('This is the first substring example', 9, 10) AS substring_extraction;

This means: I want to find a substring from the text ‘This is the first substring example’. The arguments say that the substring starts at the 9th character of the string and that its length is 10 characters.

Let’s see what this code returns:

substring_extraction
the first

There’s one column and one row. The substring extracted is ‘the first’. This is the most basic use of SUBSTRING(); the code doesn’t even use any tables!

The Employees Table

To show you more interesting examples, I need some data. Let me introduce you to a table named employees.

The table stores information about the employees of an imaginary company Kooler in the following columns:

  • id – The employee’s ID.
  • first_name – The employee’s first name.
  • last_name – The employee’s last name.
  • email – The employee’s email.
  • job_title – The employee’s job title.
  • department – The employee’s department.
  • start_date – The employee’s start date at Kooler.

Here are the first several rows for you to get a sense of the data:

idfirst_namelast_nameemailjob_titledepartmentstart_date
1ClarenceWilkinsoncwilkinson@kooler.comJunior Sales AssistantSales09/2021
2MirandaBrownmbrown@kooler.comSenior Sales SpecialistSales01/2020
3FrankDrebinfdrebin@kooler.comJunior Sales ManagerSales08/2019

Example 2: Substring From a Column

As you can imagine, writing the string expression explicitly is not the only way to use SUBSTRING(). You can also use it on a column of a table.

Here’s one such example. I want to find the initials of all employees. I use the column email since I know the first two letters of the email address are the initials:

SELECT first_name,
	 last_name,
	 email,
	 SUBSTRING(email, 1, 2) AS employee_initials
FROM employees;

I specify the column email in the function. Getting the first two letters from the email address means the substring starts at the first character for a length of two characters. This returns the desired result:

first_namelast_nameemailemployee_initials
ClarenceWilkinsoncwilkinson@kooler.comcw
MirandaBrownmbrown@kooler.commb
FrankDrebinfdrebin@kooler.comfd
VivienKellyvkelly@kooler.comvk
SteveStephenssstephens@kooler.comss
NastassjaHarrisonnharrison@kooler.comnh
ThomasPetersontpeterson@kooler.comtp
MathildeKinskimkinski@kooler.commk
MateuszWozniakmwozniak@kooler.commw
AineDoyleadoyle@kooler.comad
LorenzoAlfieriaalfieri@kooler.comaa
PetraBabićpbabic@kooler.compb
DuarteSimoesdsimoes@kooler.comds
OlenaKostenkookostenko@kooler.comok
LaurensGrotenhuislgrotenhuis@kooler.comlg

Example 3: Substring Without the Length Argument

You can omit the length argument in SUBSTRING(), and the function still works. A good example is when you want to show only the year of the employment start date. You see, the column start_date is a little unfriendly for that. This date is written as text data in the MM/YYYY format.

Fortunately, SUBSTRING() solves this problem:

SELECT first_name,
	 last_name,
	 start_date,
	 SUBSTRING(start_date, 4) AS start_year
FROM employees;

To get the year from the column start_date, defining the start of the substring is enough. In this code, the substring starts from the fourth character. Since I omit the length argument, the length of the substring is however long it is to the end of the string from the fourth character. This is how I easily get the year, as you see below:

first_namelast_namestart_datestart_year
ClarenceWilkinson09/20212021
MirandaBrown01/20202020
FrankDrebin08/20192019
VivienKelly03/20192019
SteveStephens07/20212021
NastassjaHarrison03/20222022
ThomasPeterson01/20222022
MathildeKinski01/20222022
MateuszWozniak01/20222022
AineDoyle10/20212021
LorenzoAlfieri10/20212021
PetraBabić05/20212021
DuarteSimoes04/20202020
OlenaKostenko11/20192019
LaurensGrotenhuis06/20172017

Example 4: POSITION() and CHARINDEX()

Back to working with emails. By company policy, the local point of an email address (i.e., the part before ‘@’) is also the employee’s username for logging into all the business applications. You need to extract this username. Here’s how:

SELECT first_name,
       last_name,
	 SUBSTRING (email, 1, POSITION('@' IN email)-1) AS username
FROM employees;

The first two arguments are what you have seen already. I want to extract a substring from the column email, and I want it to start from the first character of the string. But now, the length of the substring is different for every employee. How do I tell the function to return all characters before the ‘@’ sign?

I use POSITION(), which is equivalent to CHARINDEX() in SQL Server or MySQL. It locates the specified character in the string and returns its numeric character position. So, the length of the substring that is the employee’s username is equal to POSITION('@' IN email)-1. Why minus one? Because I don’t want ‘@’ to be included in the employee’s username.

This is the result:

first_namelast_nameusername
ClarenceWilkinsoncwilkinson
MirandaBrownmbrown
FrankDrebinfdrebin
VivienKellyvkelly
SteveStephenssstephens
NastassjaHarrisonnharrison
ThomasPetersontpeterson
MathildeKinskimkinski
MateuszWozniakmwozniak
AineDoyleadoyle
LorenzoAlfieriaalfieri
PetraBabićpbabic
DuarteSimoesdsimoes
OlenaKostenkookostenko
LaurensGrotenhuislgrotenhuis

Example 5: LENGTH() + POSITION()

The final example shows you how to find an employee’s job position from the data. Working at Kooler, I know how the job titles are formed: first comes the employee’s seniority, then the department, then the position. For example, ‘Junior Sales Assistant’ means the employee is of junior seniority, is in Sales, and works as an assistant.

Using SQL, I can extract this as a substring:

SELECT first_name,
       last_name,
       job_title,
       SUBSTRING(job_title, LENGTH(job_title) - POSITION(' ' IN REVERSE(job_title))+2) AS position
FROM employees;

This is another example of omitting the length argument, albeit a little more complex. As always, I first specify the string column – job_title in this case. After that, I somehow need to find a substring consisting only of the last word in the job title.

I do this by first using LENGTH(). It returns the length of the string in the column job_title. That’s a start; it’s the length of all three words together, including the blank spaces. If I could somehow subtract from it the number of characters in the last word, then I would have the length of the first two words, which would then give me the start of the substring I want.

This is a little complicated because different job position names have different lengths. The only thing that separates the words is the blank space. So, to get the length of the third word in the string, I have to count the number of characters up to the blank space, but from the right.

The POSITION() function saves the day again, but this time combined with REVERSE(). The REVERSE() function reverses the string expression so that ‘Junior Sales Assistant’ becomes ‘tnatsissA selaS roinuJ’. The last word becomes the first one; the word itself is reversed, too, but that doesn’t matter here.

POSITION() finds the position of the blank space after the first word of the reversed string. This equals the place of the blank space before the last word in the original (non-reversed) string.

Phew! Now, if I subtract this number from the total length of the original string, I get the start of the substring, right?

Well, not quite! Using this difference as it produces a substring that includes the last letter of the second word and the blank space before the last word. Why is that?

Two things. The start argument of the SUBSTRING() function is inclusive. Also, POSITION() calculates the position of the blank space, not the number of characters up to the blank space. So, I have to add 2 to get this result:

first_namelast_namejob_titleposition
ClarenceWilkinsonJunior Sales AssistantAssistant
MirandaBrownSenior Sales SpecialistSpecialist
FrankDrebinJunior Sales ManagerManager
VivienKellySenior Sales ManagerManager
SteveStephensJunior Sales SpecialistSpecialist
NastassjaHarrisonJunior Sales SpecialistSpecialist
ThomasPetersonJunior Reporting SpecialistSpecialist
MathildeKinskiJunior Reporting AnalystAnalyst
MateuszWozniakSenior Reporting ExpertExpert
AineDoyleJunior Reporting ManagerManager
LorenzoAlfieriSenior Reporting ManagerManager
PetraBabićJunior HR AssistantAssistant
DuarteSimoesJunior HR AssistantAssistant
OlenaKostenkoSenior HR AssistantAssistant
LaurensGrotenhuisSenior HR ManagerManager

Now that I have introduced a few other functions, you may want to take a look at some other text functions that may be useful to you.

Learn More About SUBSTRING () and Working With Text Data

Now you know when and how to use SUBSTRING(). It’s time to practice!

There are other text functions, not only SUBSTRING(). You can find them (and much more!) in the Standard SQL Functions course.

till not sure if the course is for you? Here’s a detailed description of what the course covers and how it helps you.