Celebration!! Family time!! Long drive!! A day at the beach!! All these words buzz into our minds when we think of holidays. Have you ever considered how a multinational company keeps track of holidays across the globe? There must be a data dictionary to maintain all these details so that they can ensure seamless business with their local partners.
This article will explain such a data model.
The Project Requirements in a Nutshell
I have quite simple and straightforward requirements this time. I need to build a data dictionary for holidays across many countries. I want to build it as a component that can be integrated into the main data model when and where required.
On Some Interesting Facts About Holidays Across Countries
In terms of its project requirements, this is one of the simplest problems in data modeling. Yet it’s hard enough to design a data model for it. Usually holidays fall on a fixed date every year, but this isn’t the case for every holiday in every country. If we analyze holidays across various countries, we can easily foresee the complications involved in this data model design.
Let’s have a look at some interesting facts about holidays in various countries:
Many holidays, especially patriotic ones, are observed on a fixed date every year.
Independence Day in the USA and India is observed on 4th July and 15th August respectively.
Some holidays are celebrated on a specific day each year – but not always the same calendar date.
Thanksgiving day in the USA is celebrated on the 4th Thursday in November. Last year (2015) this fell on November 26th; this year, it will be November 24th.
Some holidays are celebrated on a fixed date in a year, but if the date falls on Saturday or Sunday, the holiday is purposely shifted to the following Monday to observe a long weekend. Such a holiday is sometimes called ‘Mondayized’.
In Australia and New Zealand, ANZAC Day is celebrated on the 6th of February, but should that date fall on Saturday or Sunday, the holiday is observed on the Monday a day or two later.
Another good example is Labor Day in China. This holiday is also ‘Mondayized’.
The dates of some holidays are shifted by a week if they clash with some other holiday.
Family and Community Day in Australia is celebrated on the first Monday of October, but if Labor Day also falls on the first Monday, then Family Day is shifted to the second Monday in October.
- Not all holidays are observed as bank holidays, i.e., holidays when banks, financial institutes, stock markets, and government offices are closed. (In the USA and Canada, bank holidays are known as federal or statutory holidays.)
- Patriotic holidays are strictly observed on the same date every year. All institutes and offices (including banks) in all regions of the country are closed on that day. However, in some countries, like the USA and Canada, if these holidays fall on a weekend, they will also be observed on the following Monday – that is to say, banks and government offices will be closed on that Monday.
Holidays with the same name are observed on different days in different countries.
Labor Day is observed on May 1 in India, whereas it is observed on the first Monday of September in Canada.
Some holidays day-offs are traditionally bundled with non-holiday days off.
Labor Day in China and South Africa is observed on one day, but two other days off are included.
Other days, while not technically holidays, are customarily allowed as non-work days.
In the USA, the Friday after Thanksgiving is unofficially known as Black Friday. It’s not a government holiday, but many companies give their employees the day off.
Some holidays are observed differently in different regions within a country.
Summer Bank Holiday in the UK is celebrated on the first Monday of August in Scotland, but the same holiday is observed on the last Monday of August in England, Guernsey, Jersey, Northern Ireland, and Wales.
Certain regional or local holidays are observed in just one part of a country. These may be linked to religious, ethnic, or cultural events.
Louis Riel Day is celebrated only in the Canadian province of Manitoba.
Some observance days for certain holidays are based on a ‘before’ or ‘after’ condition.
- National Patriot’s Day is observed in the Canadian province of Quebec on the Monday before the 25th of May.
- Repentance Day in Germany is observed on the Wednesday immediately before November 23.
- Jeune Genevois in Switzerland is observed on the Thursday following the first Sunday of September.
Certain holiday celebration days are based on older calendars that don’t match the commonly-used Gregorian calendar. Therefore, their dates vary every year.
- Easter is celebrated on the first Sunday after the full moon occurring on or soonest after the 21st of March.
- Diwali (an ancient Hindu festival) is celebrated over several days, from the end of the Hindu lunar month of Ashvin and the start of the month of Kartika. Usually, this falls somewhere between mid-October and mid-November in the Gregorian calendar.
- Orthodox Christmas – This follows the older Julian calendar. As of 2016, there is a difference of 13 days between the Julian calendar and the Gregorian calendar. As a result, Orthodox Christmas falls on January 7, 2016.
Summarizing the Facts
It’s important to note that I am considering only the internationally-accepted Gregorian calendar (which follows the solar cycle) for automating the data population for holidays for years and the countries. In this article, I am not considering Lunisolar, Hebrew, or Hindu calendars (which follow the lunar cycle). However, these calendars are being followed in specific regions of the globe. For now, holidays based on these calendars can be fed into the system manually.
To sum up, holidays across countries can be categorized based on how their dates are derived:
- Fixed Holidays – Holidays which occur on a fixed date every year.
- Movable Holidays – Holidays which fall on a specific day, like the first Monday of February or third Thursday of November.
- Adjustable Holidays – Holidays that come under either category, but are sometimes observed on other days to avoid clashing with other celebrations (or clashing with the weekend) or shifted to the next week due to clash with other holiday on the same date.
- Holidays Based on Other Calendars – Holiday observances that are based on the lunar, Orthodox, or Hindu calendar. For now, these are manually fed into our model.
We can further put holidays into two categories based on where they are observed:
- National Holidays – Holidays which are observed at the country level.
- Regional or Local Holidays – Holidays which are observed in a particular state or region of a country.
In almost all countries, national and regional holidays are observed as bank holidays at country or regional level. Not all holidays are bank holidays, though, so we should designate which holidays are bank holidays and which are not.
At this point, we should also consider some theoretical scenarios for specific business areas. For example:
- In some countries, banks and other financial institutions are given a day off on the first day of every quarter.
- Some organizations give a day off after publishing their quarterly results.
We will make sure these points are also included in our data model design.
Designing a Comprehensive Holiday Data Model
While designing the data model, I shall use the US convention that the week begins on Sunday. It will not be too difficult to change this later if needed.
This entire data model will revolve around three subject areas: “Calendar”, “Holiday”, and “Country”.
The “Calendar” Subject Area
In this area, there is a main table named
calendar that stores dates for many years. There will also be some additional columns to store pre-computed numeric values, which will help us derive dates for certain moveable holidays. The columns are as follows:
There are two more tables in this subject area:
As their names suggest, we will store details of individual days and months in these tables. Therefore, they will always have 7 and 12 records respectively. Some things to keep in mind for this section are:
- We can configure the start of the week by means of a sequence column in both of these tables. We can do the same with the start of the year.
- The primary keys of both tables are referred in the
calendartable. They store numeric values for days of the week and months of the year.
- The value of a year can be extracted from the
calendar_datecolumn, but I still keep
calendar_yearas a separate column. This allows us to partition the table on this column, which in turn enables better performance for underlying SQLs.
- The size of number columns has been defined based on possible values for the column. For example, the
day_of_yearmust be some value between 1 and 365, so I define number(3) as the column’s datatype.
The “Holiday” Subject Area
As we said before, there are two types of holidays – fixed and moveable. So we’ll create two different tables, one for each type.
holiday_fixed table uses
month_of_year_id columns to store numeric values for day and month. Using these values, we can derive a date for a fixed holiday.
On similar lines, the
holiday_moveable table will use the following columns to derive a date for each moveable holiday:
is_bank_holiday column signifies whether the holiday is a bank holiday, i.e., all financial institutions are closed on that day. This column is required in both of the tables.
is_mondayized column modifies the date for holidays that fall on a Saturday or Sunday but are observed on the following Monday.
Let’s also create another table, namely
holiday_miscellaneous, to store records for holidays based on non-Gregorian calendars. Records will be inserted into this table manually.
All these three tables have one column referencing the
holiday_category table. This holds data on the nature of the holiday. There can be various categories here, including:
- Public / Bank Holiday – Banks are officially closed and no trading takes place.
- State Holiday – Public holidays at the state-only level.
- National Holiday – Generally a patriotic anniversary or a day defined by law that is celebrated nationwide.
- Local Holiday – Declared by local government, and observed in a specific region only.
- Observance – Holidays which are not celebrated on their actual dates, but on some other day (often Monday). Usually allows people to have a three-day weekend.
You must have noticed the
state_id column in all three holiday tables. Let’s talk about the significance of this column in the next section.
The “Country” Subject Area
We have two tables in this subject area:
country– which stores country names and IDs;
state– which stores state and/or region names and IDs for each individual country.
Eventually, we will refer to this
state table in all three holiday tables to determine which region, state, and country a holiday belongs to.
Since many holidays are celebrated at the country level, it does not make sense to keep state-level records for such holidays in the
holiday table. That would become extremely redundant. Instead, we can have one record in the
state table with ‘ALL’ as a state name. This record can be mapped with all holidays of that country, thus eliminating the need to keep huge records in the
holiday table unnecessarily.
The Final Holiday Data Model
Let’s have a look at the complete holiday data model here:
There are several ways we can play around with this model. For example:
Get a list of all the holidays observed in a particular country, say Poland.
Select hm.holiday_name, calendar_date, hm.is_bank_holiday from calendar c, holiday_moveable hm Where hm.month_of_year_id = c.month_of_year_id and hm.day_of_week_id =c.day_of_week_id and c.calendar_year = 2016 And hm.state_id = (select state_id from state s, country c where s.country_id = c.id and c.country_name = ‘POLAND’ ) UNION ALL Select hf.holiday_name, calendar_date, hf.is_bank_holiday from calendar c, holiday_fixed hm Where hm.month_of_year_id = c.month_of_year_id and hm.day_of_month = to_number(to_char(c.calendar_date,’DD’)) and c.calendar_year = 2016 And hm.state_id = (select state_id from state s, country c where s.country_id = c.id and c.country_name = ‘POLAND’) ;
Find the date for Thanksgiving Day in 2018 – Remember, this is celebrated in all states of the US on the fourth Thursday in November.
Select hm.holiday_name, calendar_date, hm.is_bank_holiday from calendar c, holiday_moveable hm Where hm.month_of_year_id = c.month_of_year_id And hm.day_of_week_id =c.day_of_week_id And c.calendar_year = 2018 And hm.holiday_name = ‘THANKSGIVING’ And hm.state_id = (select state_id from state s, country c where s.country_id = c.id and c.country_name = ‘USA’ )
Get a list of when Independence Day is celebrated in all countries. Usually, this is on a fixed date every year, and the day is strictly observed in all areas of the country.
Select c.country_name, calendar_date from calendar c, holiday_fixed hf, state s, country c Where hf.state_id = s.id and s.country_id = c.id And s.state_name = ‘ALL’ And c.month_of_year_id = hf.month_of_year_id And c.day_of_month = trunc(calendar_date) And hf.holiday_name = ‘INDEPENDENCE DAY’ and c.calendar_year = 2016;
Putting the Holiday Data Model to Use
Would you like to play around with this data model? Go for it. Here are just some of the queries we thought up:
- Find the dates Labor Day is observed in various countries.
- Get a list of all 2016 holidays for every part of the UK.
- Make a list of all bank holidays observed in France in 2016.
- Get the list of all holidays observed in the Canadian province of Manitoba in 2016.
How did you manage to store holiday details in your application? I would love to hear your ideas. Please, feel free to share your experience with storing this metadata as well as your take on our solution.