Analytic functions: NTILE
Comments enabled. I *really* need your comment
In the previous article we dealt with analytic functions SUM
, AVG
and ROW_NUMBER()
.
Now we will try to emulate NTILE
.
NTILE(N)
is a special function that has no aggregate analog. It divides each grouping set of rows into N
subranges, based on ORDER BY
clause, and returns the subrange number for each row.
If we query for NTILE(5) OVER (PARTITION BY grouper ORDER BY id)
, it will return 1 for the 20% of rows with lowest id
, 2 for the next 20% of rows, etc.
Easy to see that we need a COUNT
for each grouper
to use in our calculations. Unlike SUM
, AVG
and ROW_NUMBER()
, the value of this function does not depend only on value of the previous queries.
That's why we'll need an extra join:
SELECT ao.*, FLOOR((@r * @_n) / cnt) + 1 AS `NTILE(5) OVER (PARTITION BY grouper ORDER BY id)` FROM ( SELECT @_n := 5, @_grouper := 'N' ) vars, ( SELECT a.*, cnt FROM ( SELECT grouper, COUNT(*) AS cnt FROM t_aggregator GROUP BY grouper ) cnt, t_aggregator a WHERE a.grouper = cnt.grouper ORDER BY grouper, id ) ao WHERE CASE WHEN @_grouper <> grouper THEN @r := -1 ELSE 0 END IS NOT NULL AND (@r := @r + 1) IS NOT NULL AND (@_grouper := grouper) IS NOT NULL
As you can see, we select the COUNT(*)
of rows for each grouper and join the results of the query with t_aggregator
, so along with each row of t_aggregator
we have the total count of rows for the given grouper
.
In the SELECT
clause, we just count the percentile our current row belongs to, multiply it by N
and add a 1 (as NTILE
results are 1-based).