EXPLAIN EXTENDED

How to create fast database queries

SQL Server: finding date closest to a birthdate

Comments enabled. I *really* need your comment

From Stack Overflow:

The problem is with dynamically getting upcoming what we call namedays in Europe.

Name day is a tradition in many countries in Europe and Latin America of celebrating on a particular day of the year associated with the one's given name.

Here's a sampe name day table (I store the date in this format: 1900-08-22, but we really only need month and day):

First name Name day
Bob 1900-04-22
Bob 1900-09-04
Frank 1900-01-02

The trick is that there might be multiple name days for each name, and someones name day is always the first one found after ones birthday.

So if Bob was born on August 5th, his name day would fall on September 4th, but if we were born after Sep 4th, his name day would be on April 22nd.

We need to find when a given person celebrates his/her name day based on the first name and birthdate.

What I need is an SQL Server query that will be able to get me name days for the people in my database.

We actually need to make the calendar to wrap around: to build a condition that would select 1904, Jan 1st after 1904, Dec 31th.

We can do it using a COALESCE on the simple queries in OUTER APPLY clause.

Let's create sample tables:


Table creation details

For each user, we have to select the first name day that comes after his birthday (his because it was more simple to make all these names male, no offence, gals).

We first need to find out his birthday as of 1904. To do this, we need to subtract the difference between his year of birth and 1904 using DATEADD.

Then, we select the first name day for his name that is more or equal to his birthday.

However, there can be no nameday if the user's birthday is close enough to the end of year. So if it is, we just need to select first name day of the year for the user's first name, and this will be the name day for this user. We will do it in a subquery which will be a second argument to COALESCE function. This subquery will be evaluated only if no name day will be found using the first method.

And here's the query:

SELECT  *
FROM    [20090821_nameday].t_user u
OUTER APPLY
        (
        SELECT  COALESCE(
                (
                SELECT  TOP 1 nameday
                FROM    [20090821_nameday].t_nameday nd
                WHERE   nd.first_name = u.first_name
                        AND nd.nameday >= DATEADD(year, 1904 - YEAR(u.birth_date), u.birth_date)
                ORDER BY
                        nd.nameday
                ),
                (
                SELECT  TOP 1 nameday
                FROM    [20090821_nameday].t_nameday nd
                WHERE   nd.first_name = u.first_name
                ORDER BY
                        nd.nameday
                )
                ) AS nameday
        ) dates

See the query results

One more question: why did I use year 1904 to keep the name day dates instead of 1900 proposed by the author?

This is simple: 1900 was not a leap year, since it's divisible by 100 but not by 400. So the name day that falls on Feb 29th cannot be stored as a DATETIME with year 1900.

1904, on the other hand, was a leap year and had Feb 29th.

Written by Quassnoi

August 21st, 2009 at 11:00 pm

Posted in SQL Server

Leave a Reply