From Stack Overflow:
The problem is with dynamically getting upcoming what we call namedays
in Europe.
Name day is a tradition in many countries in Europe and Latin America of celebrating on a particular day of the year associated with the one's given name.
Here's a sampe name day table (I store the date in this format: 1900-08-22
, but we really only need month and day):
First name |
Name day |
Bob |
1900-04-22 |
Bob |
1900-09-04 |
Frank |
1900-01-02 |
The trick is that there might be multiple name days for each name, and someones name day is always the first one found after ones birthday.
So if Bob was born on August 5th, his name day would fall on September 4th, but if we were born after Sep 4th, his name day would be on April 22nd.
We need to find when a given person celebrates his/her name day based on the first name and birthdate.
What I need is an SQL Server query that will be able to get me name days for the people in my database.
We actually need to make the calendar to wrap around: to build a condition that would select 1904, Jan 1st after 1904, Dec 31th.
We can do it using a COALESCE
on the simple queries in OUTER APPLY
clause.
Let's create sample tables:
Table creation details
CREATE SCHEMA [20090821_nameday]
CREATE TABLE [20090821_nameday].t_name (
id INT NOT NULL PRIMARY KEY,
first_name VARCHAR(20) NOT NULL UNIQUE
)
CREATE TABLE [20090821_nameday].t_user (
id INT NOT NULL PRIMARY KEY,
first_name VARCHAR(20) NOT NULL,
birth_date DATETIME NOT NULL
)
CREATE TABLE [20090821_nameday].t_nameday (
first_name VARCHAR(20) NOT NULL,
nameday DATETIME NOT NULL,
PRIMARY KEY(first_name, nameday),
CHECK(YEAR(nameday) = 1904)
)
GO
BEGIN TRANSACTION
SELECT RAND(20090821)
DECLARE @cnt INT
SET @cnt = 1
WHILE @cnt <= 30
BEGIN
WITH rows AS
(
SELECT 1 AS letter,
CAST(RAND() * 11 AS INTEGER) + 1 AS consonant,
CAST(RAND() * 5 AS INTEGER) + 1 AS vowel
UNION ALL
SELECT 2 AS letter,
CAST(RAND() * 11 AS INTEGER) + 1 AS consonant,
CAST(RAND() * 5 AS INTEGER) + 1 AS vowel
UNION ALL
SELECT 3 AS letter,
CAST(RAND() * 11 AS INTEGER) + 1 AS consonant,
CAST(RAND() * 5 AS INTEGER) + 1 AS vowel
UNION ALL
SELECT 4 AS letter,
CAST(RAND() * 11 AS INTEGER) + 1 AS consonant,
0 AS vowel
)
INSERT
INTO [20090821_nameday].t_name
SELECT @cnt,
(
SELECT CASE letter WHEN 1 THEN SUBSTRING('BCDGLMNPSTV', consonant, 1) ELSE SUBSTRING('bcdglmnpstv', consonant, 1) END +
CASE letter WHEN 4 THEN 'ius' ELSE SUBSTRING('aeiou', vowel, 1) END AS [text()]
FROM rows
FOR XML PATH('')
)
SET @cnt = @cnt + 1
END
SET @cnt = 0
WHILE @cnt < 450
BEGIN
INSERT
INTO [20090821_nameday].t_nameday
SELECT first_name,
CAST('1904-01-01' AS DATETIME) +
CAST(CAST(366 AS FLOAT) / 15 * (@cnt / 30) AS INTEGER) +
CAST(
RAND() *
(
CAST(CAST(366 AS FLOAT) / 15 * ((@cnt / 30) + 1) AS INTEGER) -
CAST(CAST(366 AS FLOAT) / 15 * (@cnt / 30) AS INTEGER)
) AS INTEGER)
FROM [20090821_nameday].t_name
WHERE id = @cnt % 30 + 1
SET @cnt = @cnt + 1
END
SET @cnt = 0
WHILE @cnt < 30000
BEGIN
INSERT
INTO [20090821_nameday].t_user
SELECT @cnt, first_name,
CAST('1920-01-01' AS DATETIME) + CAST(RAND() * 25000 AS INTEGER)
FROM [20090821_nameday].t_name
WHERE id = @cnt % 30 + 1
SET @cnt = @cnt + 1
END
COMMIT
[/sourcecode]
</div>
Here we have a set of <strong>30</strong> randomly-generated Latin-like names, <strong>15</strong> random name days for each of these names and <strong>30,000</strong> users:
<a href="#" onclick="xcollapse('X4639');return false;"><strong>See the list of funny names</strong></a>
<br />
<div id="X4639" style="display: none; ">
SELECT *
FROM [20090821_nameday].t_name
id |
first_name |
14 |
Bitedudius |
28 |
Debabopius |
20 |
Debaculius |
25 |
Depunonius |
19 |
Donumicius |
2 |
Gemevubius |
9 |
Genocotius |
10 |
Gobesuvius |
4 |
Gotapogius |
24 |
Gunucopius |
29 |
Latudebius |
16 |
Legepimius |
1 |
Levotepius |
15 |
Lomipapius |
30 |
Mebusapius |
26 |
Mipomebius |
21 |
Mudumebius |
23 |
Mumapunius |
11 |
Nuducepius |
3 |
Pasavatius |
17 |
Pelelalius |
6 |
Pogibotius |
18 |
Pusigidius |
27 |
Sevonolius |
5 |
Sugusedius |
7 |
Tadecunius |
13 |
Temupudius |
8 |
Viculegius |
22 |
Visetitius |
12 |
Vopetepius |
For each user, we have to select the first name day that comes after his birthday (his
because it was more simple to make all these names male, no offence, gals).
We first need to find out his birthday as of 1904. To do this, we need to subtract the difference between his year of birth and 1904 using DATEADD
.
Then, we select the first name day for his name that is more or equal to his birthday.
However, there can be no nameday if the user's birthday is close enough to the end of year. So if it is, we just need to select first name day of the year for the user's first name, and this will be the name day for this user. We will do it in a subquery which will be a second argument to COALESCE
function. This subquery will be evaluated only if no name day will be found using the first method.
And here's the query:
SELECT *
FROM [20090821_nameday].t_user u
OUTER APPLY
(
SELECT COALESCE(
(
SELECT TOP 1 nameday
FROM [20090821_nameday].t_nameday nd
WHERE nd.first_name = u.first_name
AND nd.nameday >= DATEADD(year, 1904 - YEAR(u.birth_date), u.birth_date)
ORDER BY
nd.nameday
),
(
SELECT TOP 1 nameday
FROM [20090821_nameday].t_nameday nd
WHERE nd.first_name = u.first_name
ORDER BY
nd.nameday
)
) AS nameday
) dates
See the query results
id |
first_name |
birth_date |
nameday |
0 |
Levotepius |
1968-11-12 00:00:00.000 |
1904-11-21 00:00:00.000 |
1 |
Gemevubius |
1941-07-29 00:00:00.000 |
1904-08-30 00:00:00.000 |
2 |
Pasavatius |
1928-03-01 00:00:00.000 |
1904-03-06 00:00:00.000 |
|
29997 |
Debabopius |
1940-05-21 00:00:00.000 |
1904-05-28 00:00:00.000 |
29998 |
Latudebius |
1931-07-26 00:00:00.000 |
1904-08-16 00:00:00.000 |
29999 |
Mebusapius |
1936-06-17 00:00:00.000 |
1904-06-29 00:00:00.000 |
30000 rows fetched in 1.2157s (0.0030s) |
Table 't_nameday'. Scan count 60000, logical reads 120383, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't_user'. Scan count 1, logical reads 184, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 500 ms, elapsed time = 1113 ms.
One more question: why did I use year 1904 to keep the name day dates instead of 1900 proposed by the author?
This is simple: 1900 was not a leap year, since it's divisible by 100 but not by 400. So the name day that falls on Feb 29th cannot be stored as a DATETIME
with year 1900.
1904, on the other hand, was a leap year and had Feb 29th.