19th Nov 2020 11 minutes read

What Is a SQL Constraint?

Data Engineering

Table of Contents

Understanding SQL Constraints
Examples of SQL Constraints
Using SQL Constraints
SQL Constraints Are a Powerful Tool

What are SQL constraints? When and why should you use them? We give examples of common constraints and demonstrate how to apply them.

When you’re creating a relational database, you often want to impose certain restrictions on some columns. For example, a column containing a social security number should store only unique values; a column containing a date of birth should not be null, or left empty. These conditions can be ensured by using SQL constraints.

In this article, I’ll explain what a SQL constraint is and describe its purposes and benefits. Then we’ll look at real-world applications of the most common SQL constraints.

Understanding SQL Constraints

A SQL constraint is a rule for ensuring the correctness of data in a table. Frequently used SQL constraints include:

NOT NULL – The column value cannot be empty (i.e. cannot contain a null value).
UNIQUE – The column cannot contain duplicate values (i.e. all values in the column must be different).
PRIMARY KEY – Each column value must uniquely identify that row in the table.

There are other SQL constraints that I’ll discuss later. First, let’s look at the Clients table, which stores information about Company X’s clients.

Id	FirstName	LastName	DateOfBirth	SocialSecurityNumber	Age	ActiveFlag
…	…	…	…	…	...	...

What rules might we want to impose on the Clients table?

FirstName, LastName, and DateOfBirth cannot be empty.
SocialSecurityNumber must store unique values.
Age must have values greater than or equal to 18.
ActiveFlag must be ‘Yes’ by default.

To implement these rules, we use SQL constraints.

SQL constraints are powerful and important database design utilities. For more info on mastering them, check out our SQL Constraints interactive course.

Examples of SQL Constraints

Let’s go through all the most common SQL constraints one by one. Each of them is explained with an example.

Example 1: NOT NULL

The NOT NULL constraint guarantees that the column will have a value for each record. In other words, the column will not allow empty values to be stored in it.

Let’s create a table that contains a column with the NOT NULL constraint:

CREATE TABLE Movies (
ReleaseYear INTEGER,
Title VARCHAR(50) NOT NULL
)

If we try to add this record to the Movies table …

INSERT INTO Movies VALUES (null, null)

… we get an error message saying: Column ‘Title’ cannot be null.

The only way to insert a row to the Movies table is to ensure that the value for column Title is not null:

INSERT INTO Movies VALUES (null, 'Titanic')

Please note that there is no constraint set on the column ReleaseYear; it can be null.

The NOT NULL constraint is used on columns required for a particular record entry, such as:

When we sign up for an email newsletter, we must provide our email address.
When registering at the hospital, we must provide our date of birth.

These required fields use the NOT NULL constraint to ensure that the user always provides a value.

Example 2: UNIQUE

The UNIQUE constraint ensures that no two values in a column are the same.

Let’s look at a table that uses the UNIQUE constraint:

CREATE TABLE Person (
DateOfBirth DATE,
SocialSecurityNumber INTEGER UNIQUE
)

Let’s insert the following record into the Person table:

INSERT INTO Person VALUES (DATE '1970-01-01', 123456789)

Now, suppose we run this exact INSERT statement again. We’ll see an error message: Duplicate entry ‘123456789’ for key ‘uniqueconstraint.SocialSecurityNumber’. This is because the column SocialSecurityNumber implements the UNIQUE constraint.

If we change the second INSERT statement to the following ...

INSERT INTO Person VALUES (DATE '1970-01-01', 123456788)

… it will add another record to the Person table.

Please note that the column DateOfBirth does not implement any constraints and can have any date value for any record.

The UNIQUE constraint is used for columns that cannot contain duplicate values. For example, when we are born, each of us receives a social security number that is unique to us. So if a table contains a column that stores social security numbers, this column must use the UNIQUE constraint to avoid two persons having the same number – even if it is inserted by human error.

Example 3: CHECK

The CHECK constraint checks the condition that follows it, e.g. CHECK Age>21 ensures that each Age column value is greater than 21.

Below is a table that implements the CHECK constraint:

CREATE TABLE Clients (
Name VARCHAR(30),
Email VARCHAR(40),
Age INTEGER CHECK (Age>=18),
ActiveFlag VARCHAR(3)
)

The CHECK constraint guarantees that each value in the Age column is greater than or equal to 18.

Let’s try inserting a record where the Age value is less than 18:

	INSERT INTO Clients VALUES (17, ‘Anthony’, ‘anthony@email.com’, ‘Yes’)

In the above case, we receive an error message: Check constraint ‘checkconstraint_chk_1’ is violated.

To add a record to the Clients table, we have to change the INSERT statement:

	INSERT INTO Clients VALUES (18, ‘Anthony’, ‘anthony@email.com’, ‘Yes’)

Now the CHECK constraint is satisfied.

The CHECK constraint can be used to implement custom constraints on a column. For example:

If a table must store only adults’ data, we could use a CHECK constraint on the Age column: CHECK (Age>=18),
If a table must store only US citizens’ data, we could use the CHECK constraint on the Country column: CHECK (Country=‘USA’).

Example 4: DEFAULT

The DEFAULT constraint sets the default value for a column. This value is used when the user doesn’t enter a value in a column.

Let’s see it in action. First, we create a table with two DEFAULT columns:

CREATE TABLE Zoo (
AnimalAge INTEGER DEFAULT 0,
AnimalName VARCHAR(20) DEFAULT 'Add name'
)

The default value for AnimalAge is 0 and for AnimalName it’s ‘Add name’. Let’s try inserting a record into the Zoo table:

INSERT INTO Zoo VALUES ()

As we inserted a record with no values provided, the default values are used for both the columns. Let’s check it by running a SELECT statement:

SELECT * FROM Zoo

AnimalAge	AnimalName
0	Add name

Please note that if we use the following INSERT statement ...

	INSERT INTO Zoo VALUES (10, ‘Winnie’)

… then the default values will be overridden.

The DEFAULT constraint should be used whenever we want a column to store a value for every record where the user doesn’t provide a value. For example, when adding a new client to the Clients table, the ActiveFlag column could be set to “Yes” by default.

Example 5: PRIMARY KEY

According to database normal forms, the PRIMARY KEY constraint should be used in every table in a relational database.

The PRIMARY KEY constraint follows these rules:

There can be only one PRIMARY KEY in a table.
PRIMARY KEY values uniquely identify each row; thus, all PRIMARY KEY column values must be unique.
Null values are not allowed in the PRIMARY KEY column.

Let’s look at an example:

CREATE TABLE Fruits (
Id INTEGER PRIMARY KEY,
Name VARCHAR(20)
)

In the above example, the column Id is the primary key for the Fruits table. This means that Id must have unique values and cannot have any null values.

For more information about the PRIMARY KEY constraint, read our article What is a Primary Key in SQL?.

Example 6: FOREIGN KEY

The FOREIGN KEY constraint creates a link with the PRIMARY KEY of another table. The column that implements the FOREIGN KEY constraint references a PRIMARY KEY column from another table; the current table is linked to the second table through this constraint.

Let’s see an example of a FOREIGN KEY constraint being linked to the PRIMARY KEY of another table:

CREATE TABLE FruitStorage (
StorageRoomNo INTEGER PRIMARY KEY,
Quantity INTEGER,
FruitId INTEGER,
FOREIGN KEY (FruitId) REFERENCES Fruits(Id)
)

The Id column from the Fruits table is a foreign key to the FruitStorage table via the column FruitId. These two tables are linked with each other. If any data is inserted in the FOREIGN KEY column that does not correspond to any value in the PRIMARY KEY column, the database will return an error. However, a FOREIGN KEY column will accept null values.

For example, a customer has only one Id, which belongs to the CustomerProfile table. But one customer can make many purchases, which are stored in the PurchaseRecord table. Here, the two tables are constrained by the PRIMARY KEY and FOREIGN KEY dependency relationship (a one-to-many relationship).

The FOREIGN KEY constraint is frequently used to create links between relational database tables. The article Referential Constraints and Foreign Keys in MySQL explains more about this constraint.

Using SQL Constraints

When to Use SQL Constraints

There are a number of situations when we should use SQL constraints to ensure data correctness. In this section, I’ll summarize all the SQL constraints that we’ve learned and implement our first real-world example.

Let’s imagine that we have a table called Clients:

Id	FirstName	LastName	DateOfBirth	SocialSecurityNumber	Age	ActiveFlag
…	…	…	…	…	...	...

When this table was created, we set up rules that must be followed by any application trying to manipulate this table’s data. These rules are:

The Id column is the PRIMARY KEY. It uniquely identifies each client record and cannot be null.
We need the client’s FirstName, LastName, and DateOfBirth; these columns could use the NOT NULL constraint.
The SocialSecurityNumber column will store unique values, and for that we can use the UNIQUE constraint.
All the clients must be adults, so we need to CHECK that Age >= 18.
We’ll assume that all clients are active, so let’s set the default value of the ActiveFlag column to ‘Yes’ using the DEFAULT constraint.

Data Definition Language (DDL) statements create and modify database objects, including tables. These DDL statements allow us to set and later modify constraints. We set the SQL constraints during table creation using the CREATE statement.

Let’s look at the DDL statement that includes all the constraints listed above. All the constraints are marked in red.

CREATE TABLE Clients (
Id INTEGER PRIMARY KEY,
FirstName VARCHAR(50) NOT NULL,
LastName VARCHAR(50) NOT NULL,
DateOfBirth INTEGER NOT NULL,
SocialSecurityNumber INTEGER UNIQUE,
Age INTEGER CHECK (Age>=18),
ActiveFlag VARCHAR(3) DEFAULT 'Yes'
)

To answer the question of when to use SQL constraints, let me say that, in general, we use SQL constraints to impose certain rules on the columns of particular tables. We decide what types of constraints to apply based on the business context of the table and also our common sense.

SQL constraints are generally used to create a database structure and make all the applications that use this database conform to our rules. As constraints are commonly set during table creation, be sure to check out our article on How to Create a Table in SQL.

Why Use SQL Constraints?

The usage of SQL constraints ensures that:

Data is correct (appropriate to the column).
Data is of good quality (no information is missing or of the wrong data type).
The applications that input the data must conform to the rules/constraints we set.

By using SQL constraints, we save time spent checking that the data has the required correctness. Constraints ensure that if future INSERTs or UPDATEs do not conform to the rules, they will not be added to our database.

Adding Constraints: Modifying Table Definitions with DDL

We can modify a table’s SQL constraints using the ALTER statement with the ADD CONSTRAINT clause. The ALTER statement must consider the current data in the table. It can’t contradict existing data, e.g. if column X contains duplicate values, we cannot set a UNIQUE constraint on this column unless the duplicates are removed.

Let’s see an example of altering constraints. First, we’ll make a table without any kind of constraint:

CREATE TABLE People (
Id INTEGER,
FirstName VARCHAR(20),
Age INTEGER,
Email VARCHAR(40)
)

The above table does not implement any of the SQL constraints. However, we can use the ALTER TABLE statement to add them:

ALTER TABLE People
ADD CONSTRAINT MyPrimaryKey PRIMARY KEY (Id);
ALTER TABLE People
ADD CONSTRAINT AgeCondition CHECK (Age>=18);
ALTER TABLE People
ADD CONSTRAINT EmailCondition UNIQUE(Email);

After running the above ALTER TABLE statements, we create three constraints on the People table:

The Id column becomes the table’s primary key.
Age values must be greater than or equal to 18.
Email column values must be unique.

Now, if we would like to remove one of the above constraints from the People table, we use the ALTER statement with the DROP CONSTRAINT clause:

ALTER TABLE People
DROP CONSTRAINT EmailCondition

By executing the above ALTER statement, the EmailCondition constraint is removed and the People table will accept duplicate values for the Email column.

SQL Constraints Are a Powerful Tool

SQL constraints allow you to impose certain conditions and rules on your data, helping you to ensure its accuracy and completeness. Let’s quickly summarize what the different constraints do:

UNIQUE makes sure all column values are different.
NOT NULL forces each field to have a value.
CHECK allows you to set customized constraints.
DEFAULT enters a specified value into a field that the user leaves blank. It can also be used to set a default value, like setting IsActive to “yes”.
PRIMARY KEY uniquely identifies each record in a table.
FOREIGN KEY refers back to another table’s primary key and links two tables together based on column values – which is very essential in relational databases.

I’m sure you’d agree that SQL constraints are very crucial to good database design as well as data correctness. Why not practice what you’ve learned in this article in our Creating Database Structure track?

Tags:

Data Engineering