Relativity of time – shortcomings in Python datetime, and workaround

Recently I found out that the standard library support for date and time calculations in Python is not quite as capable as I needed. It turned out that the superficial leanness and simplicity of Python’s datetime module bites back hard, and sooner than you would expect. Unfortunately, when looking for replacements, I found that the existing replacement modules have issues of their own. This blog entry highlights various problems with the current Python datetime implementation. A partial solution is offered, too.

Basics of time zones

Time zones are a relatively new invention in the long history of measuring time. During most of the 19th century pretty much every European town had its own definition of local time. It was not until 1880 that Greenwich Mean Time was officially made the standard time in Great Britain; much of the rest of the world had adopted the idea by the 1920s. Today, all countries in the world use standard time zones, though not every one uses a full-hour offset from GMT as originally conceived.

The concept of summer time (daylight saving time in AmE) complicates things further. For example, in the European Union the member states switch to summer time on the last Sunday of March at exactly 01:00 GMT. Summer time lasts until the last Sunday of October, 01:00 GMT. In Finland, this means that this year on 30th March the official time stepped from 02:59:59 EET to 04:00:00 EEST in an instant. Likewise, on 26th October this year, the summer time clocks will tick up to 03:59:59 EEST, and on the next second the local time will be 03:00:00 EET; and almost an hour later, 03:59:59 EET. Thus, the number of seconds between 02:59:59 and 04:00:00 on a single day might be 1, 3601, or 7201; the difference between 02:59:59 and 03:00:01 might likewise be 2 or 3602 seconds… or even undefined.
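The three different second counts can be checked with a small sketch. It assumes fixed-offset stand-ins for EET and EEST built with the standard library's `timezone` class (the names `EET`, `EEST` and `elapsed_seconds` are made up for this illustration), so the offset in force has to be picked by hand for each timestamp:

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for the two Finnish offsets (illustration only;
# they know nothing about when the switch between them happens).
EET = timezone(timedelta(hours=2), "EET")
EEST = timezone(timedelta(hours=3), "EEST")


def elapsed_seconds(start, end):
    """Real elapsed seconds between two aware datetimes."""
    return int((end - start).total_seconds())


# 30 March 2008: the clocks jump from 02:59:59 EET straight to 04:00:00 EEST.
spring = elapsed_seconds(datetime(2008, 3, 30, 2, 59, 59, tzinfo=EET),
                         datetime(2008, 3, 30, 4, 0, 0, tzinfo=EEST))

# An ordinary summer day: both timestamps are in EEST.
summer = elapsed_seconds(datetime(2008, 6, 19, 2, 59, 59, tzinfo=EEST),
                         datetime(2008, 6, 19, 4, 0, 0, tzinfo=EEST))

# 26 October 2008: 02:59:59 is still EEST, 04:00:00 is already EET.
autumn = elapsed_seconds(datetime(2008, 10, 26, 2, 59, 59, tzinfo=EEST),
                         datetime(2008, 10, 26, 4, 0, 0, tzinfo=EET))

print(spring, summer, autumn)  # 1 3601 7201
```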

To avoid such confusion and misunderstanding, a reference time scale can be used for calculations that span different time zones. The obvious choice is Coordinated Universal Time (UTC), which replaced Greenwich Mean Time as the standard reference time scale for civilian applications in 1972. During the Internet era UTC has become increasingly important.

Time zones in Python – welcome to hell

Suppose you have a shared web calendar application that is used by people all over the world. Each user wants to view the calendar in their respective local time, and you wish to use UTC on the server. The server has been set up with Europe/Helsinki as the local timezone. And you wish to use the facilities provided by the Python standard library modules. Simple date arithmetic would be needed – what could possibly go wrong? You will soon find out that it is not at all simple. Actually it is annoyingly complicated:

>>> from datetime import datetime
>>> dt = datetime.now()
>>> dt
datetime.datetime(2008, 6, 19, 14, 51, 41, 296552)
>>> # ok, it prints the local time. Let's try to
>>> # convert it to UTC time...
>>> dt.utctimetuple()
(2008, 6, 19, 14, 51, 41, 3, 171, 0)
>>> # wait, ahem... 14:51:41... that can't be right...
>>> # the docs say: utctimetuple(...)
>>> #     Return UTC time tuple, compatible with time.localtime().
>>> #
>>> # ok.. so UTC time tuple, compatible with localtime...
>>> # WTF?? my local time zone is not UTC... strangely enough
>>> # the last field in the tuple, "is_dst", is 0, or false...
>>> # I thought June was in summer...
>>>
>>> # Ok, the factory method I need seems to be utcnow
>>> # - that way I can get the time in UTC?
>>> datetime.utcnow()
datetime.datetime(2008, 6, 19, 11, 59, 9, 750844)
>>> # fair enough, UTC time.

>>> # Let's try simple date arithmetic: the difference
>>> # between now... and now...
>>> datetime.now() - datetime.utcnow()
datetime.timedelta(0, 10799, 999984)
>>> # Hmm... now did that statement really
>>> # take 3 hours to execute?

The reason for these anomalies is that without any time zone information, instances of the datetime class behave as if they stored time in UTC. For our purposes this is unacceptable: if a user of the hypothetical calendar application proposes a meeting 2 hours from now, be it 17:15 EEST or 14:15 UTC, then meeting.start - datetime.now() should at this very moment result in 2 hours, regardless of the time zone of the user asking.
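The desired behaviour can be sketched with time zone aware datetimes. This is only an illustration: `EEST` is a hand-made fixed offset, and "now" is pinned to 15:15 EEST so that the result is reproducible:

```python
from datetime import datetime, timedelta, timezone

# Hand-made fixed offset standing in for Finnish summer time (assumption).
EEST = timezone(timedelta(hours=3), "EEST")

# "Now" is pinned to a fixed value instead of calling datetime.now().
now = datetime(2008, 6, 19, 15, 15, tzinfo=EEST)

# The same meeting, expressed in two different time zones.
meeting_local = datetime(2008, 6, 19, 17, 15, tzinfo=EEST)
meeting_utc = datetime(2008, 6, 19, 14, 15, tzinfo=timezone.utc)

print(meeting_local == meeting_utc)  # True
print(meeting_local - now)           # 2:00:00
print(meeting_utc - now)             # 2:00:00
```

Because both operands carry their offset, the subtraction compares instants rather than wall-clock readings.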

While there are several freely available Python modules that provide date and time calculations, like Zope’s DateTime, the problem is that none of them is really compatible with the datetime interface: if you use code that expects datetime instances, Zope’s DateTime objects will not help you. Some of the replacement modules, like mxDateUtil, seem to use dubious date arithmetic and are not really useful either. Clearly, we have to either fix the Python datetime class somehow, or provide a compatible implementation that works as expected.

Fixing datetime

Fortunately, Python datetimes can be made time zone aware by supplying an instance of tzinfo to the constructor. Unfortunately, the Python standard library does not provide any concrete implementations. Dang! Enter pytz, a Python library that supplies hundreds of concrete time zone definitions.

>>> import pytz
>>> eurhel = pytz.timezone("Europe/Helsinki")
>>> localt = datetime.now(eurhel)
>>> utct = datetime.now(pytz.utc)
>>> utct - localt
datetime.timedelta(0, 0, 3410)

Works as expected. And utct - datetime.utcnow() fails with “TypeError: can’t subtract offset-naive and offset-aware datetimes”, which is good, as it would not yield sensible results. However, a look under the hood reveals that something is fundamentally wrong:

>>> import datetime
>>> datetime.datetime.now(eurhel)
datetime.datetime(2008, 6, 23, 18, 2, 31, 101025,
        tzinfo=<DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)
>>> datetime.datetime(2008, 6, 23, 18, 2, 31, 101025, eurhel)
datetime.datetime(2008, 6, 23, 18, 2, 31, 101025,
        tzinfo=<DstTzInfo 'Europe/Helsinki' HMT+1:40:00 STD>)
>>> # after a minute...
>>> datetime.datetime(2008, 6, 23, 18, 2, 31, 101025, eurhel) - datetime.datetime.now(eurhel)
datetime.timedelta(0, 4687, 688091)

That’s right: the datetime object created by a call to the datetime.datetime constructor now seems to think that Finland uses the ancient “Helsinki Mean Time”, which was obsoleted in the 1920s. The reason for this behaviour is clearly documented on the pytz page: it seems the Python datetime constructor never asks the tzinfo object what the offset to UTC on the given date would be. Without knowing the date, pytz seems to default to the first historical definition. Now, some readers might argue that the problem would go away simply by defaulting to the latest time zone definition. However, the problem would still persist. For example, Venezuela switched to GMT-04:30 on 9th December 2007, so datetime objects representing dates either before or after the change would become invalid.
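To illustrate why a single fixed definition per zone cannot be enough, here is a toy tzinfo subclass that picks its offset from the date it is asked about. It is a sketch only and not how pytz works internally: the class name `ToyCaracas` is made up, and the switch is assumed to happen at local midnight for simplicity.

```python
from datetime import datetime, timedelta, tzinfo


class ToyCaracas(tzinfo):
    """Toy zone: UTC-04:00 before 9 Dec 2007, UTC-04:30 from then on."""

    # Simplifying assumption: the change takes effect at local midnight.
    CHANGE = datetime(2007, 12, 9)

    def utcoffset(self, dt):
        # datetime passes itself in, so the offset can depend on the date.
        if dt is not None and dt.replace(tzinfo=None) < self.CHANGE:
            return timedelta(hours=-4)
        return timedelta(hours=-4, minutes=-30)

    def dst(self, dt):
        return timedelta(0)  # no summer time in this toy zone

    def tzname(self, dt):
        return "VET"


tz = ToyCaracas()
before = datetime(2007, 12, 1, 12, 0, tzinfo=tz)
after = datetime(2007, 12, 20, 12, 0, tzinfo=tz)

print(before.isoformat())  # 2007-12-01T12:00:00-04:00
print(after.isoformat())   # 2007-12-20T12:00:00-04:30
```

One tzinfo object serves both dates here because the offset is computed at query time rather than fixed when the datetime is constructed.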

The solution offered on the pytz pages is to use the localize and normalize methods of pytz tzinfo instances; however, this makes the whole datetime system too cumbersome to use. As I wanted to use datetime objects with time zones as easily as possible, I had to subclass the Python datetime implementation and hack some internal aspects of it. The module, fixed_datetime, also contains a function, set_default_timezone, that allows mimicking naive datetime objects: unlike ordinary datetime objects, fixed_datetime.datetime objects are never ‘naive’, but many of the methods default to the time zone set by that function.

>>> import fixed_datetime

>>> # set default timezone...
>>> fixed_datetime.set_default_timezone("Europe/Helsinki")

>>> # uses default timezone...
>>> fixed_datetime.datetime.now()
fixed_datetime.datetime(2008, 6, 23, 18, 33, 20, 525486,
        tzinfo=<DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)

>>> # also works correctly
>>> fixed_datetime.datetime(2008, 6, 23, 18, 33, 20, 525486)
fixed_datetime.datetime(2008, 6, 23, 18, 33, 20, 525486,
        tzinfo=<DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)

>>> # UTC timestamps returned with UTC tzinfo
>>> fixed_datetime.datetime.utcnow()
fixed_datetime.datetime(2008, 6, 23, 15, 37, 44, 777729, tzinfo=<UTC>)

>>> # subtraction still works correctly!
>>> utcdt = fixed_datetime.datetime.utcnow()
>>> heldt = fixed_datetime.datetime.now()
>>> heldt - utcdt
datetime.timedelta(0, 5, 495702)
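The idea of a "never naive" datetime with a module-wide default zone can be approximated in a few lines. The sketch below is not the code of fixed_datetime: the class name `AwareDatetime` is invented, fixed offsets from the standard library stand in for real zone definitions, and only the constructor and now() are covered.

```python
from datetime import datetime, timedelta, timezone

_default_tz = timezone.utc  # module-wide default (hypothetical)


def set_default_timezone(tz):
    """Set the tzinfo applied whenever none is given explicitly."""
    global _default_tz
    _default_tz = tz


class AwareDatetime(datetime):
    """datetime subclass whose instances always carry a tzinfo."""

    def __new__(cls, *args, **kwargs):
        obj = super().__new__(cls, *args, **kwargs)
        if obj.tzinfo is None:
            # Attach the default zone instead of staying naive.
            obj = obj.replace(tzinfo=_default_tz)
        return obj

    @classmethod
    def now(cls, tz=None):
        return super().now(tz if tz is not None else _default_tz)


set_default_timezone(timezone(timedelta(hours=3), "EEST"))

local = AwareDatetime(2008, 6, 23, 18, 33, 20)
utc = AwareDatetime(2008, 6, 23, 15, 33, 20, tzinfo=timezone.utc)

print(local.isoformat())  # 2008-06-23T18:33:20+03:00
print(local - utc)        # 0:00:00
```

A real implementation would also have to handle DST transitions and the remaining factory methods, which is where most of the complexity lies.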

As a bonus, fixed_datetime.datetime contains methods to convert datetimes to and from ISO 8601 format. The parsing method supports the time zone field, too:

>>> fixed_datetime.datetime.fromisoformat("20081010T010203+0500")
fixed_datetime.datetime(2008, 10, 10, 1, 2, 3, tzinfo=<UTC+05:00>)

>>> fixed_datetime.datetime.fromisoformat("2008-10-10 01:02:03Z")
fixed_datetime.datetime(2008, 10, 10, 1, 2, 3, tzinfo=<UTC>)

>>> # fractional hours, decimal comma, odd timezone
>>> fixed_datetime.datetime.fromisoformat("2008-10-10 01,0341666667-04:37")
fixed_datetime.datetime(2008, 10, 10, 1, 2, 3,
        tzinfo=<UTC-04:37>)

>>> fixed_datetime.datetime.today().isoformat(' ')
'2008-06-23 18:54:32+03:00'

>>> # isoformat supports short format, too
>>> fixed_datetime.datetime.now().isoformat(short=True)
'20080623T185303.489792+0300'

>>> # addition across DST boundary works as expected:
>>> before = fixed_datetime.datetime(2008, 10, 26, 2, 0, 0)
>>> before
fixed_datetime.datetime(2008, 10, 26, 2, 0, tzinfo=
        <DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)

>>> # now, add 2 hours
>>> before + fixed_datetime.timedelta(seconds=7200)
fixed_datetime.datetime(2008, 10, 26, 3, 0, tzinfo=
        <DstTzInfo 'Europe/Helsinki' EET+2:00:00 STD>)

You can download the said module below.
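For comparison, this is roughly what the addition across the DST boundary looks like with plain standard library objects, assuming hand-made fixed offsets `EET` and `EEST`. The sum keeps the original offset, and the result has to be relabelled by hand, which is the step that pytz's normalize or fixed_datetime take care of:

```python
from datetime import datetime, timedelta, timezone

# Hand-made fixed offsets (assumption); they carry no transition rules.
EET = timezone(timedelta(hours=2), "EET")
EEST = timezone(timedelta(hours=3), "EEST")

before = datetime(2008, 10, 26, 2, 0, tzinfo=EEST)

# Plain addition keeps the EEST label, although summer time has ended by then.
later = before + timedelta(hours=2)

# Relabel the same instant with the offset actually in force.
relabelled = later.astimezone(EET)

print(later.isoformat())       # 2008-10-26T04:00:00+03:00
print(relabelled.isoformat())  # 2008-10-26T03:00:00+02:00
print(later == relabelled)     # True
```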

Remaining issues

Not every issue is solved. fixed_datetime still does not accept “24” as an hour value (which the ISO standard permits), and will throw an exception on positive leap seconds. It is also much slower than the stock Python implementation, as many of the operations need to create 2 or 3 intermediate datetime instances.
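The hour value “24” could be handled with a small wrapper around the parser. The following is a minimal sketch for one fixed format only; the function name `parse_with_hour_24` is made up, and it assumes that 24:00:00 denotes midnight at the end of the given day:

```python
from datetime import datetime, timedelta


def parse_with_hour_24(text):
    """Parse 'YYYY-MM-DDTHH:MM:SS', accepting 24:00:00 as end of day."""
    date_part, time_part = text.split("T")
    if time_part == "24:00:00":
        # End of this day is the same instant as the start of the next one.
        return datetime.strptime(date_part, "%Y-%m-%d") + timedelta(days=1)
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")


print(parse_with_hour_24("2008-12-31T24:00:00"))  # 2009-01-01 00:00:00
print(parse_with_hour_24("2008-12-31T23:59:59"))  # 2008-12-31 23:59:59
```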

Sadly, it seems that Java got it right: one class (Date) that stores times as milliseconds since the Unix epoch in UTC, plus subclasses of the abstract Calendar class that deal with getting and setting individual components and with date arithmetic in a localized way, would indeed be the best long-term solution. To some, Java’s date and calendar handling may seem overly complicated; to me it is the simplest way of representing the complex world of different calendars, time zones and other aspects of time keeping. If only someone could persuade the Python devs to add something similar to the standard library…
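A rough Python rendering of that split might look as follows. It is a sketch of the design idea, not of Java's actual API: the class `Instant` and its methods are invented, it counts seconds rather than milliseconds for simplicity, and it leaves all calendar logic to the standard datetime class.

```python
from datetime import datetime, timedelta, timezone


class Instant:
    """A point in time stored only as seconds since the Unix epoch (UTC)."""

    def __init__(self, epoch_seconds):
        self.epoch_seconds = epoch_seconds

    @classmethod
    def from_datetime(cls, aware_dt):
        # timestamp() takes the UTC offset of an aware datetime into account.
        return cls(aware_dt.timestamp())

    def __sub__(self, other):
        # Differences never depend on any time zone.
        return timedelta(seconds=self.epoch_seconds - other.epoch_seconds)

    def in_zone(self, tz):
        # Calendar fields are produced only on request, for a given zone.
        return datetime.fromtimestamp(self.epoch_seconds, tz)


EEST = timezone(timedelta(hours=3), "EEST")  # hand-made fixed offset
start = Instant.from_datetime(datetime(2008, 6, 19, 17, 15, tzinfo=EEST))
now = Instant.from_datetime(datetime(2008, 6, 19, 12, 15, tzinfo=timezone.utc))

print(start - now)                              # 2:00:00
print(start.in_zone(timezone.utc).isoformat())  # 2008-06-19T14:15:00+00:00
print(start.in_zone(EEST).isoformat())          # 2008-06-19T17:15:00+03:00
```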

Download

Download fixed_datetime.py, released under 3-clause BSD license.

15 thoughts on “Relativity of time – shortcomings in Python datetime, and workaround”

  1. “If only someone could persuade Python devs to add something similar to the standard library…”

    Have you brought this up on the python or python dev list? It certainly sounds like something they would (or should) consider …

  2. thanks for the article, and the fixed_datetime. I too was so tired of Python’s datetime class lacking proper time zone support.

  3. Bravo. Great blog on an annoying topic. Thanks for the fix too.

    Not so sure that the Java guys got it right first time. Count the number of deprecated methods in class Date. Also, I like it when object names (Class names) represent “real” things. What on earth is an instance of GregorianCalendar. I thought that would be a singleton: there is only one GregorianCalendar. In this context the Joda package is worth a look.

    Thanks again for the very helpful blog

  4. Pingback: chobas.com’s blog » Blog Archive » Labor Day

  5. Also not sure if the Java guys got it right, although they didn’t get it as hilariously wrong as Python. What we really need is a Python equivalent for Java’s jodatime libraries – which are about as good as it gets in terms of date handling libraries.

    I think what bothers me is that most responses to questions about timezone handling in Python seem to brush away the issue – the worst case being that the documentation states that UTC is equivalent to GMT.

  6. Python is a serious disaster when it comes to anything to do with time, dates, and timezones.

  7. Thanks. Fixed the link.

    Nice that someone finds four-year-old information still useful 🙂 It tells how little the Python world has moved forward…

  8. “That’s right, the datetime object created by a call to datetime.datetime constructor now seems to think that Finland uses the ancient “Helsinki Mean Time” which was obsoleted in the 1920s. The reason for this behaviour is clearly documented on the pytz page: it seems the Python datetime implementation never asks the tzinfo object what the offset to UTC on the given date would be. And without knowing it pytz seems to default to the first historical definition. Now, some of you fellow readers could insist on the problem going away simply by defaulting to the latest time zone definition. However, the problem would still persist: For example, Venezuela switched to GMT-04:30 on 9th December, 2007, causing the datetime objects representing dates either before, or after the change to become invalid.”

    From pytz document: DO NOT use
    datetime(2012,12,31,12,0,0, tzinfo=pytz.timezone('Europe/Helsinki'))

    instead use
    datetime(2012,12,31,12,0,0, tzinfo=pytz.utc).astimezone(pytz.timezone('Europe/Helsinki'))

    You can also test with
    >>> datetime(2012,12,31,12,0,0, tzinfo=pytz.utc).astimezone(pytz.timezone('Europe/Helsinki'))
    datetime.datetime(2012, 12, 31, 14, 0, tzinfo=<DstTzInfo 'Europe/Helsinki' EET+2:00:00 STD>)
    >>> datetime(1912,12,31,12,0,0, tzinfo=pytz.utc).astimezone(pytz.timezone('Europe/Helsinki'))
    datetime.datetime(1912, 12, 31, 13, 40, tzinfo=<DstTzInfo 'Europe/Helsinki' HMT+1:40:00 STD>)

    “The preferred way of dealing with times is to always work in UTC, converting to localtime only when generating output to be read by humans.” From pytz document, again.

    Hope it helps!

  9. @Truong Sinh, you are partially correct. However, when writing software for users, one also needs to accept user input! If a user that you know to be in Finland enters a date “2012-07-12 09:55”, you need to know exactly when that is; it is not acceptable to just use UTC!

  10. Also, this entry was written by me, not Mikko (prkl!) 😀

  11. The way I understand the pytz documentation, you would use localize when reading input and astimezone() to display it, such as:

    Read input from someone in Helsinki:

    loc_dt = eurhel.localize(datetime(2013, 5, 29, 12, 0, 0))

    Then store as UTC (only if you want to store date/times for later retrieval):

    utc_dt = loc_dt.astimezone(pytz.utc)

    And display back with whatever time zone you need:

    ldn_dt = utc_dt.astimezone(pytz.timezone("Europe/London"))
