7th Jul 2021 13 minutes read

Primary Key vs. Unique Key: Explaining the Differences

Data Engineering

Table of Contents

All About the Primary Key
All About the Unique Key
Primary Key vs. Unique Key
The Factors in Deciding Between a Primary Key vs. a Unique Key

When designing a database, we often need to decide between defining a primary key vs. a unique key. Both of them are crucial during the design and further phases of a database. They allow us to uniquely identify each row, ensure the uniqueness of values in the column(s), and more. Let’s take a deep dive to learn more.

This article will go through the primary and unique keys, their functions, and features. For each type of key, we will present the following:

Examples, using single or multiple columns, with extensive discussion on the syntax of the code.
Their benefits.
Their disadvantages.

At the end, we’ll briefly compare the primary and unique keys and summarize their differences.

Let’s get started!

All About the Primary Key

Let’s start with the most basic definition of the primary key. A primary key ensures that every row of a table has a unique and non-null identifier. It is a database constraint available in all relational databases. You can define a primary key on single or multiple columns as we will see in the examples below.

It is advised that every table have a primary key, as it allows not only the unique identification of rows but also the linking of tables using primary and foreign keys. To explain, a foreign key is a column (or set of columns) in one table that references the primary key of another table, which links the two tables to each other.

Check out the article on What is a Primary Key in SQL? to learn more about the primary key constraint.

Now, we can look at some usage examples to get a better grasp on how to define a primary key.

Defining a Primary Key on a Single Column

We’ll start with creating the Orders table to store the order data of a webshop.

CREATE TABLE Orders (
    OrderId integer primary key,
    CustomerFirstName varchar(50),
    CustomerLastName varchar(70),
    OrderDate date,
    TotalCost decimal(15,2)
);

Alternatively, we can define the primary key as a named table-level constraint:

CREATE TABLE Orders (
    OrderId integer,
    CustomerFirstName varchar(50),
    CustomerLastName varchar(70),
    OrderDate date,
    TotalCost decimal(15,2),
    CONSTRAINT single_column_pk PRIMARY KEY (OrderId)
);

To create a new table in the database, we use a CREATE TABLE statement followed by the name of the table. Next, we list the columns along with the types of data that they can store:

The OrderId column is of the integer data type. It is also the primary key for the Orders table. Hence, it can store only integers; furthermore, all the values must be unique and non-null due to the primary key constraint.
The CustomerFirstName column stores the first name of the customer who placed the order. It is of the data type varchar(50), meaning it can store names up to 50 characters.
The CustomerLastName column stores the last name of the customer who placed the order. It is of the data type varchar(70), meaning it can store names up to 70 characters.
The OrderDate column stores the date of the order. It is of the data type date.
The TotalCost column stores the total value of the order. It is of the data type decimal(15,2), meaning it can store up to 15 digits of which 2 are to the right of the decimal point.

To get some hands-on practice with the CREATE TABLE statement, check out our course on Creating Tables in SQL. This course walks you through not only the creation of tables but also the usage of primary keys, unique keys, foreign keys, and constraints. It also touches on how to drop and alter tables as well as how to create views.

Now that our table is ready, we can insert some values into it.

INSERT INTO Orders VALUES (1, 'Jim', 'Bean', '01-04-2021', 159.99);
INSERT INTO Orders VALUES (2, 'Cady', 'Johnson', '10-10-2020', 399.99);
INSERT INTO Orders VALUES (3, 'Bob', 'Black', '06-06-2021', 255.99);

Please note that inserting a duplicate or null value into the OrderId primary key column results in an error.
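
As a rough illustration of this note, both statements below should be rejected in most relational databases, assuming the Orders table holds the three rows just inserted. The customer name and values are made up for this sketch, and the exact error message depends on the database.

```sql
-- Rejected: OrderId 1 is already used by an existing row (duplicate key).
INSERT INTO Orders VALUES (1, 'Ann', 'Lee', '07-07-2021', 20.00);

-- Rejected: a primary key column cannot store a null value.
INSERT INTO Orders VALUES (NULL, 'Ann', 'Lee', '07-07-2021', 20.00);
```

Since neither statement succeeds, the contents of the table stay unchanged.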

The Orders table is now as shown below.

OrderId	CustomerFirstName	CustomerLastName	OrderDate	TotalCost
1	Jim	Bean	01-04-2021	159.99
2	Cady	Johnson	10-10-2020	399.99
3	Bob	Black	06-06-2021	255.99

It is also possible to define a primary key on more than one column. This is called a composite primary key. Let’s see how to do that in the next section.

Defining a Primary Key on Multiple Columns

The webshop manager now wants order IDs to be unique within each day rather than across all orders, as implemented in the previous example. This is where we need a composite primary key.

To implement the manager’s request, we must add the OrderDate column to the primary key so that now the OrderId and OrderDate columns are the composite primary key of the Orders table.

Before jumping into the code, let’s clear up the concept of the composite primary key. Here, the combination of the columns must be unique, not each column separately. The possible combinations of the values are shown below.

OrderId	OrderDate
1	01-04-2021
2	01-04-2021
3	01-04-2021
1	10-10-2020
2	10-10-2020

Had the manager made this decision before creating the Orders table, we could have made use of the CREATE TABLE statement as in the previous example to define the primary key differently.

CREATE TABLE Orders (
    OrderId integer,
    CustomerFirstName varchar(50),
    CustomerLastName varchar(70),
    OrderDate date,
    TotalCost decimal(15,2),
    CONSTRAINT composite_pk PRIMARY KEY (OrderId, OrderDate)
);

However, in our case, the change in the Orders table implementation came long after the table was created. So instead, we need to adjust the primary key constraint for the Orders table by dropping the existing primary key and creating a new one.

ALTER TABLE Orders DROP CONSTRAINT single_column_pk;
ALTER TABLE Orders ADD CONSTRAINT composite_pk PRIMARY KEY (OrderId, OrderDate);

The primary key of the Orders table now consists of the OrderId and OrderDate columns. It allows duplicate values in the OrderId column and in the OrderDate column, but the combination of the two must still be unique and non-null to satisfy the requirements of the primary key constraint. Neither the OrderId nor OrderDate column can contain any null values.
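
To make this concrete, here is a small sketch of how inserts might behave under the composite key. It assumes the Orders table still holds the three rows inserted earlier; the customer name and values are made up for illustration.

```sql
-- Accepted: OrderId 1 already exists, but not in combination with this date.
INSERT INTO Orders VALUES (1, 'Ann', 'Lee', '07-07-2021', 20.00);

-- Rejected: the combination (1, '01-04-2021') already exists.
INSERT INTO Orders VALUES (1, 'Ann', 'Lee', '01-04-2021', 20.00);

-- Rejected: OrderDate is part of the primary key and cannot be null.
INSERT INTO Orders VALUES (4, 'Ann', 'Lee', NULL, 20.00);
```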

Benefits of Using the Primary Key

The primary key became a standard feature of relational databases for a reason! It offers numerous benefits for developers and users. Let’s list at least some of them.

With a primary key, you can uniquely identify each row of a table. For example, you can update or delete only the specific rows by uniquely identifying them.
It ensures that all the values in a primary key column, or a set of columns, are unique. The database throws an error if you try to insert a duplicate value into a primary key column.
It ensures that each row contains a non-null value. A primary key column is an identifier for each record, and thus, any attempt to insert a null value results in an error.
A primary key facilitates the usage of foreign keys to link between tables. The primary key column of Table A is going to be the foreign key column when used in Table B, and that’s how the link is created between these tables.
A primary key allows for using indexes created on it. That leads to faster database operations such as searching. In databases that store a table as a clustered index on its primary key, the data is also physically ordered by the primary key.
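
Two of the points above can be sketched in SQL. The sketch assumes the single-column primary key on OrderId from the first example; the OrderNotes table, its columns, and the constraint name are hypothetical and serve only to show a foreign key.

```sql
-- Update exactly one row by addressing it through its primary key.
UPDATE Orders SET TotalCost = 379.99 WHERE OrderId = 2;

-- Hypothetical child table: each note points at exactly one order.
CREATE TABLE OrderNotes (
    NoteId integer primary key,
    OrderId integer,
    Note varchar(200),
    CONSTRAINT fk_ordernotes_orders FOREIGN KEY (OrderId) REFERENCES Orders (OrderId)
);
```

With the foreign key in place, the database should refuse a note whose OrderId does not exist in the Orders table.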

Disadvantages of Using the Primary Key

Despite the many benefits of the primary key, you can find a few disadvantages that might cause you some problems at times.

When implementing a column to be a primary key, you must ensure that there won’t be any null values.
There can be only one primary key per table, although a primary key may consist of many columns. If you want to modify the primary key, you should first remove the existing one and then create a new one, as shown in the last example.
An index is automatically created on a primary key column. This may cause updates to be slower, because the indexes must be updated as well.

If you still can’t help but wonder why a primary key is needed at all, read our article on Why Do You Need a Primary Key in a Database Table? to get a deeper insight into it.

All About the Unique Key

The unique key is also a database constraint. It does exactly what the name indicates – i.e., it ensures that the column does not contain any duplicate values. It can be defined on single or multiple columns. However, in contrast to the primary key, it allows null values.

You should define a unique key on any column whose values must be unique but that does not serve as the row identifier, particularly when the column may contain null values.

Check out the article on What Is a Unique Constraint in SQL? to learn more about the unique key constraint.

Let’s look at some examples to get a better understanding of the unique key.

Defining a Unique Key on a Single Column

Let’s create the Customers table for our webshop. This table stores the customer data and could be later linked with the previously implemented Orders table to avoid data duplication.

CREATE TABLE Customers (
    CustomerId integer primary key,
    FirstName varchar(50),
    LastName varchar(70),
    PhoneNumber varchar(15) unique,
    Email varchar(60) unique
);

Alternatively,

CREATE TABLE Customers (
    CustomerId integer primary key,
    FirstName varchar(50),
    LastName varchar(70),
    PhoneNumber varchar(15),
    Email varchar(60),
    CONSTRAINT unique_phonenumber UNIQUE (PhoneNumber),
    CONSTRAINT unique_email UNIQUE (Email)
);

As before, we use the CREATE TABLE statement followed by the name of the table to create a new table in the database. Next, we list the columns along with the types of data that they can store:

The CustomerId column is of the integer data type. It is also a primary key for the Customers table. Hence, it can store only integers, and all the values must be unique and non-null because of the primary key constraint.
The FirstName column stores the first name of the customer. It is of the data type varchar(50), meaning it can store names up to 50 characters.
The LastName column stores the last name of the customer. It is of the data type varchar(70), meaning it can store names up to 70 characters.
The PhoneNumber column stores the phone number of the customer. It is of the data type varchar(15). It is also a unique key. Hence, the phone number must be unique to each customer.
The Email column stores the email address of the customer, and it is of the data type varchar(60). It is also a unique key. Hence, the email address must be unique to each customer.

Let’s insert some values into the Customers table.

INSERT INTO Customers VALUES (1, 'Jim', 'Bean', null, 'jim.b@email.com');
INSERT INTO Customers VALUES (2, 'Cady', 'Johnson', '0987456345', null);
INSERT INTO Customers VALUES (3, 'Bob', 'Black', '0567345876', 'bob.black@email.com');

Please note that inserting a duplicate value into the PhoneNumber or Email column results in an error. Also, inserting duplicate or null values into the CustomerId column results in an error, since it is a primary key column.
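
As a sketch of these rules, each of the statements below should be rejected, assuming the Customers table holds the three rows just inserted. The new customer names and values are made up for illustration.

```sql
-- Rejected: this email already belongs to customer 3 (unique key on Email).
INSERT INTO Customers VALUES (4, 'Rob', 'Black', '0111222333', 'bob.black@email.com');

-- Rejected: this phone number already belongs to customer 2 (unique key on PhoneNumber).
INSERT INTO Customers VALUES (4, 'Ann', 'Lee', '0987456345', 'ann.lee@email.com');

-- Rejected: CustomerId 1 is already taken (primary key).
INSERT INTO Customers VALUES (1, 'Ann', 'Lee', '0111222333', 'ann.lee@email.com');
```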

The Customers table is now as follows.

CustomerId	FirstName	LastName	PhoneNumber	Email
1	Jim	Bean	null	jim.b@email.com
2	Cady	Johnson	0987456345	null
3	Bob	Black	0567345876	bob.black@email.com

It is also possible to define a unique key on multiple columns. Let’s see an example of how this works.

Defining a Unique Key on Multiple Columns

In the previous example, we implemented a unique key constraint on the PhoneNumber column and on the Email column separately. This meant that each column separately had to store unique values. You might wonder why we didn't use the composite unique key.

If we define a composite unique key containing the PhoneNumber and Email columns, then the columns would be allowed to store duplicate values as long as the combination of values in both columns is unique – much like in the case of the primary key.

Nevertheless, we could decide to implement it as such, and it would be analogous to the composite primary key.

CREATE TABLE Customers (
    CustomerId integer primary key,
    FirstName varchar(50),
    LastName varchar(70),
    PhoneNumber varchar(15),
    Email varchar(60),
    CONSTRAINT unique_phonenumber_email UNIQUE (PhoneNumber, Email) 
);

Or, if the table already exists, we can drop the existing unique constraints (although it is not necessary) and create a new one.

ALTER TABLE Customers DROP CONSTRAINT unique_phonenumber;
ALTER TABLE Customers DROP CONSTRAINT unique_email;
ALTER TABLE Customers ADD CONSTRAINT unique_phonenumber_email UNIQUE (PhoneNumber, Email);

The customer phone numbers and emails may now be duplicated; however, the combination of the two must be unique.
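
A short sketch of what this means in practice, assuming the composite unique key on (PhoneNumber, Email) is in place and the table holds the three customers inserted earlier. The new rows are made up for illustration.

```sql
-- Accepted: the phone number repeats, but the (phone, email) pair is new.
INSERT INTO Customers VALUES (4, 'Rob', 'Black', '0567345876', 'rob.black@email.com');

-- Rejected: the pair ('0567345876', 'bob.black@email.com') already belongs to customer 3.
INSERT INTO Customers VALUES (5, 'Bobby', 'Black', '0567345876', 'bob.black@email.com');
```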

Please note that, in many databases, inserting a null value into either column may result in duplicate values in the other column, like this:

PhoneNumber	Email
0123456789	null
0123456789	null
null	cady@email.com
null	cady@email.com

This is a disadvantage of the composite unique key.
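
Duplicates like these could arise from inserts such as the ones below. This is a sketch with made-up customers, assuming the composite unique key on (PhoneNumber, Email) and a database that treats null values as distinct in unique constraints, as PostgreSQL and MySQL do by default. Some systems, such as SQL Server, treat nulls as equal here and would reject the second insert.

```sql
-- Accepted: first row with this phone number and no email.
INSERT INTO Customers VALUES (10, 'Ann', 'Lee', '0123456789', NULL);

-- Also accepted where nulls count as distinct: same phone number, null email again.
INSERT INTO Customers VALUES (11, 'Amy', 'Lee', '0123456789', NULL);
```

Declaring both columns as NOT NULL would close this gap, at the cost of making both values mandatory.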

Benefits of Using the Unique Key

Let’s list some of the benefits of the unique keys.

The unique key ensures that there won’t be any duplicate values in a column that implements it.
In contrast to the primary key, there can be multiple unique keys used in one table as presented in the first example of this section.
You don’t have to prevent the insertion of null values, as null values are allowed in a unique key column.
As is the case with primary keys, unique keys can be used as foreign keys to create links between tables.
The usage of unique keys can improve database performance, as the uniqueness guarantee helps the database optimizer plan queries more efficiently.
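
The point about foreign keys can be sketched as follows. The example assumes the Customers table from the first example of this section, with its single-column unique key on Email; the NewsletterSubscriptions table, its columns, and the constraint name are hypothetical.

```sql
-- Hypothetical table whose foreign key targets a unique key instead of the primary key.
CREATE TABLE NewsletterSubscriptions (
    SubscriptionId integer primary key,
    Email varchar(60),
    SubscribedOn date,
    CONSTRAINT fk_subscription_email FOREIGN KEY (Email) REFERENCES Customers (Email)
);
```

The reference is only possible because Email is guaranteed to be unique in the Customers table.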

Disadvantages of Using the Unique Key

The unique keys do have disadvantages. Let’s look at some of them.

The primary key includes unique and non-null constraints. The unique key, on the other hand, includes only the unique constraint and therefore does not prevent any missing data.
In the case of the multicolumn unique key, if any of the unique key columns is nullable, then it does not prevent having duplicate values in the other column(s) of the unique key. For example, if a customer has a null phone number and the email “test@email.com”, another customer can also have a null phone number and the same email “test@email.com” without generating any error for the duplicate email.
There is also a small performance impact on inserts and updates, because the index that enforces the unique key must be maintained.

Primary Key vs. Unique Key

The primary and unique keys are quite similar. Both prevent duplicates and ensure the uniqueness of values, and both can be used as foreign keys.

A primary key does not allow null values, making it perfect for record identification, as it ensures that each record has a unique and non-null identifier. Hence, there can be only one primary key defined in a table to avoid having more than one identifier per record.

In contrast, a unique key allows null values and only ensures the uniqueness of the non-null values. You can define a unique key on a column that requires its values to be unique but not necessarily non-null. You can define multiple unique keys in one table, as there may be multiple columns that store only unique values.

A primary key and a unique key can both be present in a table (as shown in some of the examples); however, it makes no sense to define both a unique key and a primary key on the same column(s). In cases like this, you should stick with the primary key – not only does it ensure uniqueness just like the unique key, but it also ensures that no null values are stored.

Let’s look at an example.

CustomerId	FirstName	LastName	PhoneNumber	Email
1	Jim	Bean	null	jim.b@email.com
2	Cady	Johnson	0987456345	c.johnson@email.com
3	Bob	Black	0567345876	bob.black@email.com

If the FirstName, LastName, and Email columns of the Customers table make up the composite primary key, there is no need to define a unique key on the same three columns, since the uniqueness of that combination is already guaranteed by the primary key constraint. Note that this does not make the Email column unique on its own; only the combination of the three values must be unique. Because primary key columns cannot be null, each customer is also required to provide an email address. The PhoneNumber column is a unique key column, so customers are not required to provide a phone number.
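
One way to declare the key layout from this example is sketched below. The constraint names are made up for the sketch, and CustomerId is kept as an ordinary column only because it appears in the table above.

```sql
CREATE TABLE Customers (
    CustomerId integer,
    FirstName varchar(50),
    LastName varchar(70),
    PhoneNumber varchar(15),
    Email varchar(60),
    -- The combination must be unique, and none of the three columns may be null.
    CONSTRAINT customers_pk PRIMARY KEY (FirstName, LastName, Email),
    -- Phone numbers must not repeat, but a customer may leave the value empty.
    CONSTRAINT unique_phonenumber UNIQUE (PhoneNumber)
);
```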

The Factors in Deciding Between a Primary Key vs. a Unique Key

The message to take home is the following: both primary and unique keys ensure the uniqueness of values, but only the primary key ensures that no null values are stored in a column. Based on this, you can easily decide which key you should select for your specific application.

A table can have only one primary key, whereas it can have multiple unique keys. Keep this in mind when deciding which columns should identify the rows and which should merely store unique values.
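
These factors can be seen side by side in a small sketch. The Products table and all of its columns are hypothetical and are not part of the webshop examples above.

```sql
CREATE TABLE Products (
    -- Row identifier: unique and non-null, only one per table.
    ProductId integer primary key,
    -- Must be unique and must always be present, but is not the row identifier.
    Sku varchar(20) NOT NULL UNIQUE,
    -- Must be unique when present, but may be missing.
    Barcode varchar(20) UNIQUE,
    ProductName varchar(100)
);
```

Combining UNIQUE with NOT NULL gives a column the same guarantees as a primary key column, which is useful when a table needs more than one such column.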

Also, both primary and unique keys can improve database performance by providing indexes used in performing various database operations, such as queries, updates, and deletions.

We have even more examples for you to practice on your own. Check out the cookbooks on How to Create a Primary Key in SQL and How to make a column unique in SQL? to deepen your understanding of the concepts discussed in this article.

Good luck!
