Posted on July 29, 2013
Adding a quick lunar calendar to the data mart
I ran across an article at Popular Science citing research that correlated Heart Surgery outcomes to the Lunar Calendar. That’s right… the claim is that you are more likely to survive and to spend less time in the hospital recovering from heart surgery if you get it during a waning full moon. We know the moon can affect us in ways that make sense… for example an article popped up this week about the full moon disturbing sleep — it’s brighter, so it’s harder to sleep. Ok. And crime increases during full moons could make some sense too due to increased night visibility or such. But medical outcomes?
Well, this gives us something new to play with, so let’s find out! First, I’m working on the data warehouse I’ve been working on for the last few months (wow, almost a year). It has a well behaved date dimension table, so we’re going to add a phase-of-the-moon calculation. The basic logic for doing that is:
- Lunar month is 29.530588853 days
- Jan 1, 1900 at 13:52 PM GMT was a full moon
- There are 9 indicators of the Lunar cycle, but it basically has 4 phases:
- Waxing Crescent when Lunar month is in the first quarter (29.530588853/4)
- Waxing Gibbous when Lunar month is in the second quarter
- Waning Gibbous when Lunar month is in the third quarter
- Waning Crescent when Lunar month is in the last quarter ( > 29.530588853*3/4)
- The other five are specific indicators, which we will round to the nearest day:
- New Moon is day 1
- Dark Moon is day > 28.53
- First Quarter is day between 6.88 and 7.88 (29.530588853 / 4 +- 0.5)
- Full Moon is day between 14.265 and 15.265 (29.530588853/2 +- 0.5)
- Last Quarter is day between 21.648 and 22.648 (29.530588853 *3 / 4 +- 0.5)
There were other variations available, and really this calculation could be done on-the-fly, but this was fun enough. Here’s some SQL… note the 13:52 HH:MM is probably unnecessary but, why not?:
update DD set MOON_PHASE = CASE WHEN DATEDIFF(DAY,'1900-01-01 13:52:00.000',PK_DATE) % 29.530588853 <= 1 THEN 'New Moon' WHEN DATEDIFF(DAY,'1900-01-01 13:52:00.000',PK_DATE) % 29.530588853 BETWEEN 1 and 6.88 THEN 'Waxing Crescent' WHEN DATEDIFF(DAY,'1900-01-01 13:52:00.000',PK_DATE) % 29.530588853 BETWEEN 6.88 and 7.88 THEN 'First Quarter' WHEN DATEDIFF(DAY,'1900-01-01 13:52:00.000',PK_DATE) % 29.530588853 BETWEEN 7.88 and 14.265 THEN 'Waxing Gibbous' WHEN DATEDIFF(DAY,'1900-01-01 13:52:00.000',PK_DATE) % 29.530588853 BETWEEN 14.265 and 15.265 THEN 'Full Moon' WHEN DATEDIFF(DAY,'1900-01-01 13:52:00.000',PK_DATE) % 29.530588853 BETWEEN 15.265 and 21.648 THEN 'Waning Gibbous' WHEN DATEDIFF(DAY,'1900-01-01 13:52:00.000',PK_DATE) % 29.530588853 BETWEEN 21.648 and 22.648 THEN 'Last Quarter' WHEN DATEDIFF(DAY,'1900-01-01 13:52:00.000',PK_DATE) % 29.530588853 BETWEEN 22.648 and 28.53 THEN 'Waning Crescent' WHEN DATEDIFF(DAY,'1900-01-01 13:52:00.000',PK_DATE) % 29.530588853 >= 28.53 THEN 'Dark Moon' ELSE 'Apocalypse' END from Dimensions.D_DATE DD
So now everything that we have tied to a specific date (date of surgery, birth date, etc.) can now be tied to a lunar cycle! Great! So now what?
The Data Warehouse this is on top of has several million claims over the years of Medical, Dental, Vision, Mental Health, and other categories. Since I can’t publish any real information due to privacy and proprietary restrictions, I’ve had to dumb this down substantially, but, at least for my very-unscientific data-set, I can absolutely verify a spike in mental and behavioral health claims after a full moon:
The chart shows the relative percent of payments per claim, to adjust for the relative length of the phases (since the New Moon, First Quarter, Last Quarter, and Dark Moon are one day long each and the rest are roughly 6 days each). Immediately after the full moon, the average price per claim for Behavioral Health spikes to 151% of average. Again, this is wildly unscientific… any number of rationale could exist other than a true impact on human behavior, but we certainly can’t rule that OUT in this data.
I was NOT able to replicate the results of the cardiothoracic surgery study, but this was a really fun exercise that didn’t take long at all.
Just for more fun, I did something similar with Zodiac Sign… I’ll leave the actual analysis up to you, but here’s some code that will help generate the zodiac signs in your date dimension (Transact SQL syntax, but that’s easily repaired… thanks to http://www.midnightdba.com/Jen/2011/07/meme-monday-sqlstrology-table-valued-zodiac-function/):
DECLARE @SQLStrology TABLE ( ZodiacSign VARCHAR(20), STARTMMDD VARCHAR(20), ENDMMDD VARCHAR(20), QuickCharacteristics NVARCHAR(400)); INSERT INTO @SQLStrology (ZodiacSign, STARTMMDD, ENDMMDD, QuickCharacteristics) VALUES ('Aries', '03/21', '04/19', 'Active, Demanding, Determined, Effective, Ambitious'), ('Taurus', '04/20', '05/20', 'Security, Subtle strength, Appreciation, Instruction, Patience'), ('Gemini', '05/21', '06/20', 'Communication, Indecision, Inquisitive, Intelligent, Changeable'), ('Cancer', '06/21', '07/22', 'Emotion, Diplomatic, Intensity, Impulsive, Selective'), ('Leo', '07/23', '08/22', 'Ruling, Warmth, Generosity, Faithful, Initiative'), ('Virgo', '08/23', '09/22', 'Analyzing, Practical, Reflective, Observation, Thoughtful'), ('Libra', '09/23', '10/22', 'Balance, Justice, Truth, Beauty, Perfection'), ('Scorpio', '10/23', '11/21', 'Transient, Self-Willed, Purposeful, Unyielding'), ('Sagittarius', '11/22', '12/21', 'Philosophical, Motion, Experimentation, Optimism'), ('Capricorn', '12/22', '12/31', 'Determination, Dominance, Perservering, Practical, Willful'), ('Capricorn', '01/01', '01/19', 'Determination, Dominance, Perservering, Practical, Willful'), -- Split Capricorn to manage year boundaries ('Aquarius', '01/20', '02/18', 'Knowledge, Humanitarian, Serious, Insightful, Duplicitous'), ('Pisces', '02/19', '03/20', 'Fluctuation, Depth, Imagination, Reactive, Indecisive'); UPDATE DD SET ZodiacSign = SS.ZodiacSign FROM Dimensions.D_DATE DD inner join @SQLStrology SS on right(replicate('0',2)+CAST(MONTH(PK_DATE) AS varchar(2)),2) + '/' + right(replicate('0',2)+CAST(DAY(PK_DATE) AS VARCHAR(2)),2) BETWEEN SS.STARTMMDD AND SS.ENDMMDD