In this article, I’ll walk you through some fundamental considerations for working with date- and time-related data in MySQL. We’ll also look at how to handle multiple time zones and daylight saving time changes.
Let’s first address some core concepts that will help us understand the underlying complexity of time-related data. It is important to notice that these concepts apply when representing a point in time rather than an absolute duration.
- So what’s the big deal?
- Does this data represent a point in time in the past or in the future?
- How is this data going to be displayed?
- Is the information going to be sorted or filtered by its local time or by its UTC representation?
- What is meant by “today”?
- Time zone manipulation in MySQL
Before we talk about using date and time data in MySQL, let’s review how time zones work. We’ll start with the standard time reference.
The Standard Time Reference: UTC
Since time needs to be known and coordinated across the entire world, Coordinated Universal Time (UTC) is used as the standard absolute time reference. UTC evolved from Greenwich Mean Time (GMT) for use as a standard and is maintained by the Bureau International des Poids et Mesures in Paris.
UTC is obtained by applying a set of algorithms to about 250 cesium clocks and hydrogen masers in roughly 65 different laboratories worldwide. Real-time estimates of UTC are computed and provided by different centers, such as the United States Naval Observatory and the National Institute of Standards and Technology.
From this reference point, each time zone derives a local time that is represented as an offset from UTC. For instance, the Central European Time Zone is defined as UTC+1:00. So the local time for this region would be the current UTC time plus one hour.
Is UTC the Same as GMT?
When the original time reference had to be defined, it was agreed to make it relative to a certain point on earth. Greenwich, a borough of London that is home to the prime meridian (0o longitude) was chosen as this point. So noon GMT was the moment when the sun was at its highest point above the Greenwich Meridian. Nowadays, GMT is currently used as a time zone that has no offset from UTC. UTC is a time reference; GMT is a time zone. No place on earth uses UTC as an official time zone; GMT is used as a time zone in the UK and in parts of Africa.
Practically speaking, GMT time is the equivalent of UTC time (GMT = UTC+0). Strictly speaking, though, the current time reference is not GMT but UTC.
Time Zones and Local Time
A time zone is a region of the globe that observes a uniform standard time. The time zone is basically an offset of UTC, but it may also be subject to an additional change during Daylight Saving Time:
Daylight Saving Time (DST), also known as summer time, is observed by advancing clocks one hour in the spring and then reverting to standard time in the autumn. It was originally meant to make better use of natural daylight and reduce electricity consumption
Each zone, country, or region may have a time zone that changes over time (e.g. for geopolitical reasons) and IANA keeps track of this in a public database used in many software systems. In this case, time zones are commonly named after a regional/location pattern like America/Indiana/Knox, Australia/Melbourne, etc.
The local time is calculated by applying the regional time zone offset to the UTC time. For instance, to calculate the local time in Buenos Aires, Argentina, I apply the offset for Argentina to the current UTC time. If UTC is 16:19 and America/Buenos_Aires is UTC-3, my local time is 13:19.
It’s worth noting that since UTC is the absolute time reference, it is not affected by DST. What is affected by DST is the local offset.
Time Zones and Time Offsets Are Not the Same Thing
This is because offsets for a given time zone may change. Let’s take Los Angeles, California as an example. Los Angeles is in the IANA time zone of America/Los_Angeles. It has a UTC-7 offset during DST, but a UTC-8 offset for standard time. The offset changes, but the time zone boundaries are the same. However, other places, such as the state of Arizona, don’t follow DST; their offset does not change.
So What’s the Big Deal?
When storing time and date data, the level of complexity strongly depends on what you need to do with it in terms of retrieval, manipulation, and front-end display. Before you start designing anything, ask yourself:
Does this data represent a point in time in the past or in the future?
If it represents a point in the past, any offset for the time zone already happened and is already known. For future points, this offset could alter – DST could come into effect or even change.
Say that you want to save a future summer date and time for Los Angeles, like noon on April 13, 2060 (2060-04-13 19:00 UTC). According to current projections, DST starts at 02:00 that day, so this date and time has an offset of -7hs due to DTS.
But, let’s imagine that next year (2017), they decide to start DST one day later in California (yes DST itself is not fixed and may change over time). So if next year I had to pick the same point in time (2060-04-13 19:00 UTC), the actual local time would be 2060-04-13 11:00 since DST wouldn’t have kicked in yet.
To sum up, if you’d already stored noon on April 13, 2060 as a local date, it would not represent the intended future point in time because it was originally generated with an offset of -7hs instead of -8hs.
How is this data going to be displayed?
Sometimes you need to show the date and time in the user’s local time. Or you may need to display it in the original local time or as UTC time. You may also need to clearly state the actual time zone for that specific date. So this will affect how you store your data since, depending on the requirement, you may need to know the proper time zone.
Is the information going to be sorted or filtered by its local time or by its UTC representation?
Depending on your use case, you may need to sort and display the information based on the chronological order of events. Therefore the UTC representation, which is the actual point in time, would be the proper way to sort this data.
What if the requirement is to sort the data based on the original local time? Then sorting by UTC wouldn’t work. This also holds good for sorting by local time, such as showing movies that play on a specific local date like 2016-12-01. If you filtered the results using this date as-is and compared it against UTC times instead of the local time, then a movie that starts in Los Angeles at 2016-12-01 23:00 local time would get filtered out, as the UTC representation is 2016-12-02 07:00 UTC.
What is meant by “today”?
Like the previous point, this question (and all its relations, like next week, tomorrow, this year, etc.) will depend on your application. Say that the current user wants machine logs from machines around the globe and the interface has a button to “get logs from: today”. The local time is 2016-10-07 16:00 America/Buenos_Aires (UTC-3). What is “today” for this user? It could be:
“Based on the user’s local time”
This would be from 2016-10-07 00:00 to 2016-10-07 23:59:59 in his local time, which translated to UTC time, includes all records from 2016-10-07 03:00 to 2016-10-08 02:59:59 UTC. (Notice the date and time shift in the UTC representation).
“UTC based on user’s date segment”
In the example, the current user’s date would be 2016-10-07 in his local time. If we use that as the UTC date part, “today” would include all records that range from 2016-10-07 00:00 to 2016-10-07 23:59:59 in UTC. (Compared with the former example, this range has no date shift.)
“Records with a local date segment equal to user’s date segment”
In our example, the current user’s date would be 2016-10-07 in his local time. So, we use that part only and get whatever log has been generated in the range of 2016-10-07 00:00 to 2016-10-07 23:59:59 but in the record’s local time.
So if a record in another part of the globe was generated at 2016-10-07 23:00:00 (UTC-11), which is 2016-10-08 10:00:00 in UTC, it should be retrieved. (Notice that this record would not have been retrieved in the other two examples.)
Time Zone Manipulation in MySQL
All of this imposes the need to manipulate date data to fit application needs. It is possible to manipulate time zones in MySQL by using the
CONVERT_TZ() function. From MySQL’s documentation:
CONVERT_TZ(dt,from_tz,to_tz)converts a datetime value
dtfrom the time zone given by
from_tzto the time zone given by
to_tzand returns the resulting value (...) This function returns
NULLif the arguments are invalid.
This can be used in two ways – by indicating the time zone or by explicitly passing the time offset as the parameter, as you can see in the following examples:
mysql> SELECT CONVERT_TZ('2004-01-01 12:00:00','GMT','Europe/Moscow'); -> '2004-01-01 13:00:00'
mysql> SELECT CONVERT_TZ('2004-01-01 12:00:00','+00:00','+10:00'); -> '2004-01-01 22:00:00'
To use this function with time zones like “Europe/Moscow” you need to set up and maintain the time zone records by yourself, as stated in the MySQL documentation:
The MySQL installation procedure creates the time zone tables, but does not load them. You must do so manually using the following instructions. Loading the time zone information is not necessarily a one-time operation because the information changes occasionally. When such changes occur, applications that use the old rules become out of date and you may find it necessary to reload the time zone tables to keep the information used by your MySQL server current.
For small teams, individual developers, and on some shared hostings, it can be challenging or even impossible to keep this working properly. For this reason, I propose an alternative approach.
Getting the current time in MySQL
Before we look at the alternative approach, let’s consider how we get the current time in MySQL.
Several date functions address time. By default, MySQL uses the local system time zone. Therefore calling the function NOW() will retrieve the current time in whatever time zone the server is set. But if you want to be certain and consistent about the results, you may be better off using
UTC_TIMESTAMP() instead. From the official docs:
UTC_TIMESTAMP Returns the current UTC date and time as a value in 'YYYY-MM-DD HH:MM:SS' or YYYYMMDDHHMMSS format, depending on whether the function is used in a string or numeric context. If the fsp argument is given to specify a fractional seconds precision from 0 to 6, the return value includes a fractional seconds part of that many digits.
mysql> SELECT UTC_TIMESTAMP(), UTC_TIMESTAMP() + 0; -> '2003-08-14 18:08:04', 20030814180804.000000
I would recommend storing MySQL dates and times as a DATETIME column (read this to understand why) and storing the time zone (like “America/Los_Angeles”) in another VARCHAR column. Note that I am NOT recommending that you store the UTC offset; this is because of the DST changes explained earlier.
Above all, store the UTC time in the database. This is the most accurate representation of a point in time you can have. Since you also have the time zone column, you will always have the ability to “reconstruct” the original local date/time, either via the MySQL CONVERT_TZ() function or from your application end, as illustrated by the following PHP code snippet:
$utc_time = time retrieved from the database $timezone = timezone retrieved from the database $date_obj = new DateTime($utc_time, new DateTimeZone("UTC")); $date_obj->setTimeZone($timezone); $display_date = $date_obj->format("Y-m-d H:i:s");
This will also yield efficient queries that can make use of UTC ranges and sorting.
One important caveat is that if you cannot use CONVERT_TZ to compare or sort against the saved records’ local time (as explained in Records with a local date segment equal to user’s date segment), you could also store the date and time in its original local time. You may also want to do this to save converting the UTC time back to local time. However if you opt for this approach, keep in mind that stored local times that belong to the future may be wrong, as explained above.
A complete generic approach would be to have two DATETIME columns to store the
utc_datetime and the
local_datetime, while a VARCHAR column stores the IANA
time_zone string (the longest one at the moment is “America/Argentina/ComodRivadavia”, which has 32 characters).
For instance, a basic log table that stores data from servers around the planet could be modeled like this:
created_on fields store the UTC and local times when the log was generated, and the
log_tz field stores the local time zone.
This approach solves many problems that can arise from all the different use cases we’ve covered. It also makes it easier to deal with possible changes in the future. If you have some experiences in this area to share, please feel free to post them in the comments below!