Analytic functions: FIRST_VALUE, LAST_VALUE, LEAD, LAG
In the previous articles I wrote about emulating some of the analytic functions in MySQL.
Today, I’ll write about four more userful functions: FIRST_VALUE, LAST_VALUE, LEAD and LAG.
These functions also do not have aggregate analogs.
FIRST VALUE(column) returns the value of column from the first row of the grouping set.
LAST_VALUE(column) returns the value of column from the last row of the grouping set.
This can be illustrated by the following query:
SELECT m.*,
FIRST_ROW(month) OVER (PARTITION BY season ORDER BY id),
LAST_ROW(month) OVER (PARTITION BY season ORDER BY id)
FROM t_month m
| id | season | month | FIRST_VALUE(month) OVER (PARTITION BY season ORDER BY id) | FIRST_VALUE(month) OVER (PARTITION BY season ORDER BY id) |
|---|---|---|---|---|
| 9 | Fall | September | September | November |
| 10 | Fall | October | September | November |
| 11 | Fall | November | September | November |
| 3 | Spring | March | March | May |
| 4 | Spring | April | March | May |
| 5 | Spring | May | March | May |
| 6 | Summer | June | June | August |
| 7 | Summer | July | June | August |
| 8 | Summer | August | June | August |
| 1 | Winter | January | January | December |
| 2 | Winter | February | January | December |
| 12 | Winter | December | January | December |
This query selects the first and the last month of each row’s season.
In this example above, January is the first month and December is the last month of winter, as January, 2009 comes earlier than December, 2009.
In MySQL, we can emulate FIRST_ROW() with the following construction:
SELECT *,
@r AS `FIRST_VALUE(month) OVER (PARTITION BY season ORDER BY id)`
FROM (
SELECT m.*
FROM (
SELECT @_season = NULL
) vars,
t_month m
ORDER BY
season, id
) mo
WHERE (CASE WHEN @_season IS NULL OR @_season <> season THEN @r := month ELSE month END IS NOT NULL)
AND (@_season := season) IS NOT NULL
| id | season | month | FIRST_VALUE(month) OVER (PARTITION BY season ORDER BY id) |
|---|---|---|---|
| 9 | Fall | September | September |
| 10 | Fall | October | September |
| 11 | Fall | November | September |
| 3 | Spring | March | March |
| 4 | Spring | April | March |
| 5 | Spring | May | March |
| 6 | Summer | June | June |
| 7 | Summer | July | June |
| 8 | Summer | August | June |
| 1 | Winter | January | January |
| 2 | Winter | February | January |
| 12 | Winter | December | January |
In this case, we just assign the value of month to @r whenever the season changes.
What about LAST_ROW()?
By analogy, we’ll just assign a…
Wait. When we enter the new set, we don’t yet have something to assign! We’ll have it only when the set ends, but we already need to output the intermediate results.
We’d better change the ordering in the ORDER BY clause so that id‘s get selected in the descending order. In this case, FIRST_ROW() will become LAST_ROW() and vise versa:
SELECT *,
@r AS `LAST_VALUE(month) OVER (PARTITION BY season ORDER BY id)`
FROM (
SELECT m.*
FROM (
SELECT @_season = NULL
) vars,
t_month m
ORDER BY
season, id DESC
) mo
WHERE (CASE WHEN @_season IS NULL OR @_season <> season THEN @r := month ELSE month END IS NOT NULL)
AND (@_season := season) IS NOT NULL
| id | season | month | LAST_VALUE(month) OVER (PARTITION BY season ORDER BY id) |
|---|---|---|---|
| 11 | Fall | November | November |
| 10 | Fall | October | November |
| 9 | Fall | September | November |
| 5 | Spring | May | May |
| 4 | Spring | April | May |
| 3 | Spring | March | May |
| 8 | Summer | August | August |
| 7 | Summer | July | August |
| 6 | Summer | June | August |
| 12 | Winter | December | December |
| 2 | Winter | February | December |
| 1 | Winter | January | December |
Using the same principle, we can also emulate LEAD and LAG.
LAG(column) returns the value of column from the previous row in the grouping set, or NULL if there is no previous row.
LEAD(column) returns the value of column from the next row in the grouping set, or NULL if there is no next row.
SELECT mo.id, mo.season,
@r AS `LAG(month) OVER (PARTITION BY season ORDER BY id)`,
(@r := month) AS month
FROM (
SELECT m.*
FROM (
SELECT @_season = NULL,
@s := NULL
) vars,
t_month m
ORDER BY
season, id
) mo
WHERE (CASE WHEN @_season IS NULL OR @_season <> season THEN @r := NULL ELSE NULL END IS NULL)
AND (@_season := season) IS NOT NULL
| id | season | LAG(month) OVER (PARTITION BY season ORDER BY id) | month |
|---|---|---|---|
| 9 | Fall | September | |
| 10 | Fall | September | October |
| 11 | Fall | October | November |
| 3 | Spring | March | |
| 4 | Spring | March | April |
| 5 | Spring | April | May |
| 6 | Summer | June | |
| 7 | Summer | June | July |
| 8 | Summer | July | August |
| 1 | Winter | January | |
| 2 | Winter | January | February |
| 12 | Winter | February | December |
SELECT mo.id, mo.season,
@r AS `LEAD(month) OVER (PARTITION BY season ORDER BY id)`,
(@r := month) AS month
FROM (
SELECT m.*
FROM (
SELECT @_season = NULL,
@s := NULL
) vars,
t_month m
ORDER BY
season, id DESC
) mo
WHERE (CASE WHEN @_season IS NULL OR @_season <> season THEN @r := NULL ELSE NULL END IS NULL)
AND (@_season := season) IS NOT NULL
| id | season | LEAD(month) OVER (PARTITION BY season ORDER BY id) | month |
|---|---|---|---|
| 11 | Fall | November | |
| 10 | Fall | November | October |
| 9 | Fall | October | September |
| 5 | Spring | May | |
| 4 | Spring | May | April |
| 3 | Spring | April | March |
| 8 | Summer | August | |
| 7 | Summer | August | July |
| 6 | Summer | July | June |
| 12 | Winter | December | |
| 2 | Winter | December | February |
| 1 | Winter | February | January |
In this query, we assign a new value of month to @r in the SELECT clause, and we need to change the order or columns a little so that the new value is assigned only after the old value is selected.
If it is not acceptable for some reason, we’ll need to wrap this query into a subquery, but I won’t do it in this example for the sake of brevity.
To be continued.







Subscribe in a reader