Back to articles list Articles Cookbook
11 minutes read

The LAG Function and the LEAD Function in SQL

LAG() and LEAD() are positional functions. A positional function is a type of window function. If you are not familiar with when and how to use them, what the syntax of each function is, why you might want to use them, and what the differences are, read on!

LAG() and LEAD() are positional functions. These are window functions and are very useful in creating reports, because they can refer to data from rows above or below the current row. In this article, we will look at these two in detail.

To learn how window functions work, what functions there are, and how to apply them to real-world problems, it’s best to take our Window Functions course. You can find all about the course here. It’s interactive, there are 218 exercises, and you only need a web browser and some basic SQL knowledge.

The Syntax of the LAG Function

The LAG() function allows access to a value stored in a different row above the current row. The row above may be adjacent or some number of rows above, as sorted by a specified column or set of columns.

Let’s look its syntax:

LAG(expression [,offset[,default_value]]) OVER(ORDER BY columns)

LAG() takes three arguments: the name of the column or an expression from which the value is obtained, the number of rows to skip (offset) above, and the default value to be returned if the stored value obtained from the row above is empty. Only the first argument is required. The third argument (default value) is allowed only if you specify the second argument, the offset.

As with other window functions, LAG() requires the OVER clause. It can take optional parameters, which we will explain later. With LAG(), you must specify an ORDER BY in the OVER clause, with a column or a list of columns by which the rows should be sorted.

Let’s consider the following table, sale:

idseller_namesale_value
3Stef7000
1Alice12000
2Mili25000

And the following query with a LAG() function:

SELECT seller_name, sale_value,
  LAG(sale_value) OVER(ORDER BY sale_value) as previous_sale_value
FROM sale;

Here is the result:

seller_namesale_valueprevious_sale_value
Stef7000NULL
Alice120007000
Mili2500012000

This simplest use of LAG() displays the value from the adjacent row above. For example, the second record displays Alice’s sale amount ($12,000) with Stef’s ($7,000) from the row above, in columns sale_value and previous_sale_value, respectively. Notice that the first row does not have an adjacent row above, and consequently the previous_sale_value field is empty (NULL), since the row from which the value of sale_value should be obtained does not exist.

If you specify only the required argument (the name of the column or other expression) as we have in this example, the offset argument defaults to 1 and the third argument defaults to NULL. In our example, the first row in the result set has NULL in previous_sale_value and in the other rows are the values from the respective rows immediately above, because the offset is 1.

Using LAG(), you can see the value for the current row as well as the value from the adjacent row above. You can use this, for example, to check the sale amount of a given row against that of the previous row with the amount of sale sorted from the lowest to the highest.

The illustration below shows how the value from the adjacent row above is appended to the current row.

Table

The LAG() function is included in our handy “SQL Window Functions Cheat Sheet”.

The Syntax of the LEAD Function

LEAD() is similar to LAG(). Whereas LAG() accesses a value stored in a row above, LEAD() accesses a value stored in a row below.

The syntax of LEAD() is just like that of LAG():

LEAD(expression [,offset[,default_value]]) OVER(ORDER BY columns)

Just like LAG(), the LEAD() function takes three arguments: the name of a column or an expression, the offset to be skipped below, and the default value to be returned if the stored value obtained from the row below is empty. Only the first argument is required. The third argument, the default value, can be specified only if you specify the second argument, the offset.

Just like LAG(), LEAD() is a window function and requires an OVER clause. And as with LAG(), LEAD() must be accompanied by an ORDER BY in the OVER clause.

We again look at the table, sale:

idseller_namesale_value
3Stef7000
1Alice12000
2Mili25000

Here’s a query with a LEAD() function:

SELECT seller_name, sale_value,
  LEAD(sale_value) OVER(ORDER BY sale_value) as next_sale_value
FROM sale;

Here is the result set:

seller_namesale_valuenext_sale_value
Stef700012000
Alice1200025000
Mili25000NULL

The rows are sorted by the column specified in ORDER BY (sale_value). The LEAD() function grabs the sale amount from the row below. For example, Stef’s own sale amount is $7,000 in the column sale_value, and the column next_sale_value in the same record contains $12,000. The latter comes from the sale_value column for Alice, the seller in the next row. Note that the last row does not have a next row, so the next_sale_value field is empty (NULL) for the last row.

If you specify only the required argument, that is, only the name of the column or other expression, the offset defaults to 1 and the third argument defaults to NULL. In our example, the value for Alice’s next_sale_value is from the column sale_value of the adjacent row below, since the default offset is 1.

Using LEAD(), you can compare values across rows. The following illustration shows how the amount returned by LEAD() is appended to the current row.

Table

Using LAG() and LEAD() to Compare Values

An important use for LAG() and LEAD() in reports is comparing the values in the current row with the values in the same column but in a row above or below.

Consider the following table, annual_sale, shown below:

yeartotal_sale
201523000
201625000
201734000
201832000
201933000

As you can see, this table contains the total sale amount by year. Using LAG() and LEAD(), we can compare annual sale amounts across years.

Let’s look at this query:

SELECT  
   year,
   total_sale AS current_total_sale,
   LAG(total_sale) OVER(ORDER BY year) AS previous_total_sale,
   total_sale - LAG(total_sale) OVER(ORDER BY year) AS difference
FROM annual_sale;

Here is the result set:

yearcurrent_total_saleprevious_total_saledifference
201523000NULLNULL
201625000230002000
201734000250009000
20183200034000-2000
201933000320001000

This query grabs the sale amount from the previous year and puts it into the column previous_total_sale using the LAG() function. The ORDER BY in the OVER clause orders the records by year, ensuring that the adjacent row above represents the previous year. Then it takes the amount from the current_total_sale column in the previous row and brings it over to the current row.

This query also calculates the difference in sale amount between the current year and the previous year. This can help us understand if there was an increase (positive difference) or a decrease (negative difference) in sales from one year to the next.

For 2015, we have no information about the previous year. Therefore, the value returned by the LAG() function is NULL and so is the difference. Now, the total sale in 2018 was $32,000, but it was $34,000 in 2017 (the previous year) as shown in the column previous_total_sale. The difference is -$2,000, indicating that in 2018, there was a decrease in sales of $2,000 compared to the year 2017.

Using LAG() and LEAD() With a Specific Offset

You can use LAG() and LEAD() functions with two arguments: the name of the column and the offset.

Consider the following table, employee:

employee_idyearquarterbonus
120171100
120172250
12017360
12017420
12018180
12018280
1201830
1201840
1201910
120192100
1201930
120194150

The query below selects the bonus for the employee with ID=1 for each quarter of each year. It then identifies the bonuses for the corresponding quarter in the year before and the year after.

SELECT year, quarter,
  LAG(bonus,4) OVER(ORDER BY year,quarter) AS previous_bonus,
 bonus AS current_bonus,
  LEAD(bonus,4) OVER(ORDER BY year,quarter) AS next_bonus 
FROM employee
WHERE employee_id=1;

This query returns the following result set:

yearquarterprevious_bonuscurrent_bonusnext_bonus
20171NULL10080
20172NULL25080
20173NULL600
20174NULL200
20181100800
2018225080100
201836000
20184200150
20191800NULL
2019280100NULL
2019300NULL
201940150NULL

The rows highlighted in green are the records for the first quarter of each year, the rows in white the second quarter of each year, etc. In any given row, the previous and the next bonus amounts are taken from the corresponding quarter of the year prior and the year following, they are assigned to columns previous_bonus and next_bonus, respectively.

For example, the employee ID=1 received an $80 bonus in the first quarter of 2018. For the same employee, the 2017 first quarter bonus was $100, and the 2019 first quarter bonus was $0. The ORDER BY specifies that the rows be sorted by year and quarter. An offset of 4 tells LEAD() and LAG() to skip 4 rows before and after the current row, respectively. With this offset, you can compare values of the same quarter from different years, because there are 4 quarters in a year. The picture below illustrates this idea.

Table

You can find more about LAG() and LEAD() functions in our articles “Common SQL Window Functions: Positional Functions” by Aldo Zelen and “When Do I Use SQL Window Functions?” by Tihomir Babic.

Using LAG() and LEAD() With a Default Value

In the previous section, we discussed how to use the offset argument in LAG() and LEAD(). Now we consider cases with a third argument: the default value to assign when the value obtained is NULL. To specify this argument, you must also specify the second argument, the offset. The default offset is 1, so specify 1 to keep the default offset or some other value appropriate for your case.

Let’s look at another example. The following table, sale_product, contains the product IDs, the month (1 = January, 2 = February, etc.), and the count sold by month.

Here are the records for which the product ID is 1.

product_idmonthcount
11125
12135
13NULL
1490

The query:

SELECT product_id, month,
  LAG(count,1,0) OVER(ORDER BY month) AS previous_count,
  count AS current_count,
  count - LAG(count,1,0) OVER(ORDER BY month) AS difference
FROM sale_product
WHERE product_id=1;

returns the result:

product_idmonthprevious_countcurrent_countdifference
110125125
1212513510
13135NULLNULL
14NULL90NULL

For the product with ID=1, we select the sale month, the sale count for this month (current_count), and the sale count from the previous month (the value from the previous row returned by LAG()).

We would like to show zero instead of NULL when LAG() tries to obtain values from rows beyond those that exist in our data set. For both LAG() and LEAD(), this is done by specifying a third argument, the default value. Remember that the offset argument is required in order to specify the default value argument; here, we specify an offset of 1 to look at the row above. We then specify 0 as the third argument. This sets to zero any attempt to obtain values from rows that do not exist, as is the case here for the first row (there is no row above the first row).

Note that the default value of zero is assigned only for rows that do not exist; the rows whose adjacent rows above do exist but with NULLs in current_count are left alone as NULLs instead of changing them to 0. You can see this in the row where month=4: although the current_count for the row above (month=3) is NULL, it is not replaced with a zero, since the previous row does exist and only happens to contain a NULL in current_count.

Using LAG() and LEAD() With Partitions

Let’s continue with the same example, but now we examine a case in which we need PARTITION BY in the OVER clause. Below is the next part of the table sale_product with another product whose ID=2.

product_idmonthcount
11125
12135
13NULL
1490
21150
22100
23185
24190

The query:

SELECT product_id, month,
  LAG(count,1,0) OVER(PARTITION BY product_id ORDER BY month) AS previous_count,
  count AS current_count,
  count - LAG(count,1,0) OVER(PARTITION BY product_id ORDER BY month) AS difference
FROM sale_product;

returns the result:

product_idmonthprevious_countcurrent_countdifference
110125125
1212513510
13135NULLNULL
14NULL90NULL
210150150
22150100-50
2310018585
241851905

We have multiple products in this table. To calculate the differences between the current sales and the previous sales separately for each product, we specify PARTITION BY before ORDER BY in the OVER clause.

You can name a column or a list of columns in PARTITION BY. Here, we use the product_id column to divide the records into partitions then sort by month within each partition. As a result, every partition starts with Month 1 and ends with Month 4.

PARTITION BY is similar to GROUP BY in that it groups the same values together. Unlike GROUP BY, however, PARTITION BY does not collapse the original rows into a single row; the original rows remain accessible. You can read more about the difference between these PARTITION BY and GROUP BY in the article “What Is the Difference Between a GROUP BY and a PARTITION BY?”.

LAG and LEAD Functions Are Very Useful!

Positional functions such as LAG() and LEAD() are useful in many situations. They are often used in creating reports, because they can refer to the rows above or below, as we have seen from these examples. I hope that this article helps you extend your SQL knowledge of window functions. Read more on positional functions in the articles “Common SQL Window Functions: Positional Functions” by Aldo Zelen and “When Do I Use SQL Window Functions?” by Tihomir Babic. If you are interested in learning more about window functions, try our interactive course “Window Functions” on the LearnSQL.com platform.