Relativity of time – shortcomings in Python datetime, and workaround

Recently I found out that the standard library support for date and time calculations in Python is not quite as able as I needed. It turned out that the superficial leanness and simplicity of Python’s datetime module bit hard back sooner than you expected. Unfortunately, looking for replacements, I found out that the existing replacement modules have some issues on their own. This blog entry highlights various problems with the current Python datetime implementation. A partial solution will be offered, too.

Basics of time zones

Time zones are a relatively new invention in the long history of measuring time. During most of the 19th century pretty much each European town had its own definition of local time. It was not until 1880 that Greenwich Mean Time was officially made the standard time in the Great Britain; much of the remaining world had adopted the idea by the 1920s. Today, all countries in the world use standard time zones, though not every one is using full-hour offsets to the GMT as it was originally conceived.

The concept of summer time (daylight saving time in AmE) complicates things further: for example in European Union the member states will switch to summer time on the last Sunday of March at 01:00 GMT exact. The summer time lasts until the last Sunday of October, 01:00 GMT. In Finland, this means that this year on 30th March the official time stepped from 02:59:59 EET to 04:00:00 EEST in an instant. Likewise, on 26th October this year, the summer time clocks will tick up to 03:59:59 EEST, and on the next second the local time will be 03:00:00 EET; and almost a hour later, 03:59:59 EET. Thus, the number of seconds between 02:59:59 and 04:00:00 on a single day might be 1, 3601, or 7201; the difference between 02:59:59 and 03:00:01 might likewise be 2 or 3602 seconds… or even undefined.

To alleviate obvious confusions and misunderstandings, a reference time scale can be used for calculations that concern different time zones. The obvious choice is Coordinated Universal Time (UTC) that replaced Greenwich Mean Time as the standard reference time scale for civilian applications in 1972. During the Internet era UTC has become increasingly important.

Time zones in Python – welcome to hell

Suppose you have a shared web calendar application that is used by people all over the world. Each user wants to view the calendar in their respective local time, and you wish to use UTC on the server. The server has been set up with Europe/Helsinki as the local timezone. And you wish to use the facilities provided by the Python standard library modules. Simple date arithmetic would be needed – what could possibly go wrong? You will soon find out that it is not at all simple. Actually it is annoyingly complicated:

>>> from datetime import datetime
>>> dt = datetime.now()
>>> dt
datetime.datetime(2008, 6, 19, 14, 51, 41, 296552)
>>> # ok, it prints the local time. Let's try to
>>> # convert it to UTC time...
>>> dt.utctimetuple()
(2008, 6, 19, 14, 51, 41, 3, 171, 0)
>>> # wait, ahem... 14:51:41... that can't be right...
>>> # the docs say: utctimetuple(...)
>>> #     Return UTC time tuple, compatible with time.localtime().
>>> #
>>> # ok.. so UTC time tuple, compatible with localtime...
>>> # WTF?? my local time zone is not UTC... strangely enough
>>> # the last field in the tuple, "is_dst", is 0, or false...
>>>># I thought June was in summer...
>>>
>>> # Ok, the factory method I need seems to be utcnow
>>> # - that way I can get the time in UTC?)
>>> datetime.utcnow()
datetime.datetime(2008, 6, 19, 11, 59, 9, 750844)
>>> # fair enough, UTC time.

>>> # Let's try simple date arithmetic: the difference
>>> # between now... and now...
>>> datetime.now() - datetime.utcnow()
datetime.timedelta(0, 10799, 999984)
>>> # Hmm... now did that statement really
>>> # take 3 hours to execute?

The reason for these anomalies is that without any time zone information, instances of the datetime class behave as if they stored time in UTC. For our purposes this is unacceptable: if a user of the hypothetical calendar application proposes a meeting 2 hours from now, be it 17:15 EEST or 14:15 UTC, meeting.start - datetime.now() should on this very moment result in 2 hours regardless of the time zone of the user asking it.

While there are several freely available Python modules that provide date and time calculations, like Zope’s DateTime, the problem with them is that none of them is really compatible with datetime interface – if you use code that expects datetime instances, Zope’s DateTime objects will not help you. Some of the replacement modules, like mxDateUtil seems to use dubious date arithmetic, and are not really useful either. Clearly, we have to either fix the python datetime class somehow, or provide a compatible implementation that works as expected.

Fixing datetime

Fortunately, Python datetimes can be made time zone aware, by supplying an instance of tzinfo in the constructor. Unfortunately enough, the Python standard library does not provide any concrete implementations. Dang! Enters: pytz, a Python library that supplies hundreds of concrete time zone definitions.

>>> import pytz
>>> eurhel = pytz.timezone("Europe/Helsinki")
>>> localt = datetime.now(eurhel)
>>> utct = datetime.now(pytz.utc)
>>> utct - localt
datetime.timedelta(0, 0, 3410)

Works as expected. And, utct – datetime.utcnow() fails with “TypeError: can’t subtract offset-naive and offset-aware datetimes” – which is good, as it would not yield sensible results. However, a look under the hood reveals that something is fundamentally wrong:

>>> datetime.datetime.now()
datetime.datetime(2008, 6, 23, 18, 2, 31, 101025,
        tzinfo=<DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)
>>> datetime.datetime(2008, 6, 23, 18, 2, 31, 101025, eurhel)
datetime.datetime(2008, 6, 1, 18, 0, tzinfo=<DstTzInfo 'Europe/Helsinki' HMT+1:40:00 STD>)
>>> # after a minute...
>>> datetime.datetime(2008, 6, 23, 18, 2, 31, 101025, eurhel) - datetime.datetime.now(eurhel)
datetime.timedelta(0, 4687, 688091)

That’s right, the datetime object created by a call to datetime.datetime constructor now seems to think that Finland uses the ancient “Helsinki Mean Time” which was obsoleted in the 1920s. The reason for this behaviour is clearly documented on the pytz page: it seems the Python datetime implementation never asks the tzinfo object what the offset to UTC on the given date would be. And without knowing it pytz seems to default to the first historical definition. Now, some of you fellow readers could insist on the problem going away simply by defaulting to the latest time zone definition. However, the problem would still persist: For example, Venezuela switched to GMT-04:30 on 9th December, 2007, causing the datetime objects representing dates either before, or after the change to become invalid.

The solution offered by pytz pages is to use the normalize and localize methods of pytz tzinfo instances, however this renders the whole datetime system too cumbersome to use. As I wanted to use datetime objects with time zones as easily as possible, I had to subclass the python datetime implementation and hack some internal aspects of it. The module, fixed_datetime also contains a method, set_default_timezone, to allow mimicking of the naive datetime objects; unlike ordinary datetime objects, fixed_datetime.datetime objects are never ‘naive’, but many of the methods will default to the time zone set by the said method.

>>> import fixed_datetime

>>> # set default timezone...
>>> fixed_datetime.set_default_timezone("Europe/Helsinki")

>>> # uses default timezone...
>>> fixed_datetime.datetime.now()
fixed_datetime.datetime(2008, 6, 23, 18, 33, 20, 525486,
        tzinfo=<DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)

>>> # also works correctly
>>> fixed_datetime.datetime(2008, 6, 23, 18, 33, 20, 525486)
fixed_datetime.datetime(2008, 6, 23, 18, 33, 20, 525486,
        tzinfo=<DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)

>>> # UTC timestamps returned with UTC tzinfo
>>> fixed_datetime.datetime.utcnow()
fixed_datetime.datetime(2008, 6, 23, 15, 37, 44, 777729, tzinfo=<UTC>)

>>> # subtraction still works correctly!
>>> utcdt = fixed_datetime.datetime.utcnow()
>>> heldt = fixed_datetime.datetime.now()
>>> heldt - utcdt
datetime.timedelta(0, 5, 495702)

As a bonus, fixed_datetime.datetime contains methods to convert datetimes from ISO 8601 format. The method support parsing the time zone field, too:

>>> fixed_datetime.datetime.fromisoformat("20081010T010203+0500")
fixed_datetime.datetime(2008, 10, 10, 1, 2, 3, tzinfo=<UTC+05:00>)

>>> fixed_datetime.datetime.fromisoformat("2008-10-10 01:02:03Z")
fixed_datetime.datetime(2008, 10, 10, 1, 2, 3, tzinfo=<UTC>)

>>> # fractional hours, decimal comma, odd timezone
>>> fixed_datetime.datetime.fromisoformat("2008-10-10 01,0341666667-04:37")
fixed_datetime.datetime(2008, 10, 10, 1, 2, 3,
        tzinfo=<UTC-04:37>)

>>> fixed_datetime.datetime.today().isoformat(' ')
'2008-06-23 18:54:32+03:00'

>>> # isoformat supports short format, too
>>> fixed_datetime.datetime.now().isoformat(short=True)
'20080623T185303.489792+0300'

>>> # addition across DST boundary works as expected:
>>> before = fixed_datetime.datetime(2008, 10, 26, 2, 0, 0)
>>> before
fixed_datetime.datetime(2008, 10, 26, 2, 0, tzinfo=
        <DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)

>>> # now, add 2 hours
>>> before + fixed_datetime.timedelta(seconds=7200)
fixed_datetime.datetime(2008, 10, 26, 3, 0, tzinfo=
        <DstTzInfo 'Europe/Helsinki' EET+2:00:00 STD>)

You can download the said module below.

Remaining issues

Not every remaining issue is solved. Fixed datetime still does not accept “24” as hour value (mandated by ISO standard), and will throw an exception on positive leap seconds. Fixed datetime is much slower than the python implementation – many of the operations need to create intermediate 2 or 3 datetime instances.

Sadly it seems that Java got it right: having one class (Date) that stores times in UTC seconds relative to Unix Epoch, and subclasses of abstract Calendar class that deal with getting and setting individual components and date arithmetic in a localized way would indeed be the best long-term solution. To some Java’s date and calendar handling would seem overly complicated, to me it is the simplest way of representing the complex world of different calendars, time zones and other aspects of time keeping. If only someone could persuade Python devs to add something similar to the standard library…

Download

Download fixed_datetime.py, released under 3-clause BSD license.

\"\" Subscribe to RSS feed Follow me on Twitter Follow me on Facebook Follow me Google+

Logging with LogMan

This is my first post on our company blog and I thought I’d tell you something about LogMan, which is developer’s utility for getting logging messages from Symbian device over a USB cable. It is written by me and mostly on my own time. I started the project because I had to do a Symbian excercise for university course and I thought I’d do something useful instead of quickly tinkering something small and easy.

LogMan supports both C++ and Python. With LogMan, you can send data to the same location from Python and C++ in real-time instead of using log file(s). Browsing through multiple log files can be tedious and you can’t see the debug output while using your application, because on Symbian you must read the file after the debugging session – Symbian cannot share opened files between applications. It is also possible that you create too much log and you run out of Phone internal memory. LogMan helps by removing the use of log files and you never run out of disk space because messages can be stored directly to PC. On simulator, the messages are also sent to RDebug (%TEMP%\epocwind.out). Surprisingly I have not seen a Python module, which would enable use of RDebug. Even though it is very easy to implement. With RDebug, there is no need to open a serial port on simulator for reading the logging messages.

Of course, I tried to use RDebug on device first, but I never got it working. I also tried REcmt, which is supported on S60 only and the service just kept on crashing on my phone. This is why I decided to write LogMan. Plus both are closed software, which effectively prevented me from fixing the problems.

Just wondering what kind of benefit Symbian or Nokia gets from keeping development tools such as these closed? What is there so secret about them? It didn’t take me very long to write the first working version. *sigh*

The use of LogMan is similar to RDebug. There are static class methods, which are a bit slower but easier, and instance methods. Check the project’s homepage for more examples.

#include "LogMan.h" //RLogMan RLogMan::Log( _L("Hello world ") );

I recently added a new feature for LogMan, which allows you to log stack and heap usage of the current thread with one function. When you are unsure about your heap or stack usage, these might come handy. Of course there is some memory used when calling these so take that into consideration. Python can access MemoryInfo only, which logs both stack and heap (Well, I got a bit lazy at that point). There are equivalent macros for these, so check them out from “logmanutils.h”.

// Store this as a member of your class, for example RLogMan logman;

logman.HeapInfo(); logman.StackInfo(); logman.MemoryInfo(); // Both stack and heap logman.Close();

The output from MemoryInfo is something like this: StackInfo Free:1039000, Used:9576, Size:1048576 HeapInfo Free:25856, Used:101004, Size:126860

1. Browser for PyS60

I have been trying, unsuccessfully, to get Browser Control working on PyS60. In a nutshell, CBrCtlInterface wrapper for Python. I have developed it against PyS60Community version in Launchpad. See /src/appui/appuifw/. I have used LogMan extensively to debug the extension so if you want a real example, check out “browsercontrol.cpp”.

Browser Control would allow one to embed a browser into his PyS60 application, which would be quite cool. No need to do user interfaces with “appuifw”, which is not very portable. With Browser Control, one could create his user interface with html and javascript, which are a lot more portable indeed. Less work leads to more time. And what is time?… it’s money. Or so I have heard. And being able to handle events with Python instead of C++ is another bonus.

Unfortunately, the API is not very stable as you can see by searching for “CBrCtlInterface” at Forum Nokia. The browser worked fine on the simulator with small pages such as “www.google.com”, but it crashed miserably with larger pages. The crash happens in browsercontrol.dll when calling e32.Ao_sleep() in Python. On device it was unable to open any page and crashed instantly when trying to load one. With LogMan, I was able to verify that the crash happened in e32.Ao_sleep() on device also. This reminds me to test on device all the time, which I didn’t do for the first versions. But this is why I added the memory logging feature to LogMan, but it only revealed that I was not out of stack or heap. At least not before the browser started to mess around.

I tried to compile the WebKit myself to see what is going on, but the build instructions didn’t work and the build scripts are written in Perl(my eyes started to hurt). What a mess. I don’t wonder anymore if there are bugs in browsercontrol.dll. I finally gave up because my idea pool dried up. Any help getting the wrapper working would be very much appreciated.

2. Plans for LogMan

I’m planning to add remote shell interface so that you could control your phone from PC. I want access to the file system first. Transfer files, list folders and such. If you have TCB rights ( or hacked phone ) you can speed up development remarkably by simply replacing your binaries in \sys\bin or your Python files with new versions. No need to install sis files and fiddle with certificates and play with memory cards. It would be so nice… I can use 1 day in a week for a personal project so this may happen in near future 😉

Plone, KSS, Javascript, field validation and the cup of WTF

Knowledge does hurt. Today I had my sweet cup of WTF. We are developing a medical database based on Plone 3.1. It uses very advanced AJAX framework called KSS – basically you can avoid the pain of writing pure Javascript by crafting special CSS like stylesheets which bind server-side Python code to any Javascript event. This makes AJAX programming a joy. You can easily combine server-side logic with user interface events, like field validation.

Well… then there was an error. KSS validation was not working for the text fields on a certain pages…. or it did sometimes. We were not sure. This is so called Heisenbug. I armed myself with sleepy eyes, Firebug and a lot of energy drinks.

I saw a KSS error in the Firebug log window and failed HTTP POST in the server logs.

Invalid request.

The parameter, value, was omitted from the request.

Looks like the field value was not properly posted for the field validation.

The first thing was locate the error and get function traceback for the messy situation. Unfortunately Firefox Javascript engine or Firebug cannot show tracebacks properly… the grass is so much greener on the other side of the fence. So I had to manually search through the codebase by manually plotting console.log() calls here and there.

Finally I thought I pinpointed the cause of the failure. By shaking finger (excitement, tireness and all that extra caffeinen from energy drinks), I opened the Javascript file just to realize why Javascript is utterly utterly shitty and why no sane person wants to do low level Javascript development. If ECMA standard committee had been clever and had been able to enforce anything long time ago, the following piece could be replaced with one function call.

fo.getValueOfFormElement = function(element) {
    // Returns the value of the form element / or null
    // First: update the field in case an editor is lurking
    // in the background
    this.fieldUpdateRegistry.doUpdate(element);
    if (element.disabled) {
        return null;
    }
    // Collect the data
    if (element.selectedIndex != undefined) {
        // handle single selects first
        if(!element.multiple) {
                if (element.selectedIndex < 0) {
                    value="";
                } else {
                    var option = element.options[element.selectedIndex];
                    // on FF and safari, option.value has the value
                    // on IE, option.text needs to be used
                    value = option.value || option.text;
                }
        // Now process selects with the multiple option set
        } else {
            var value = [];
            for(i=0; i<element.options.length; i++) {
                var option = element.options[i];
                if(option.selected) {
                    // on FF and safari, option.value has the value
                    // on IE, option.text needs to be used
                    value.push(option.value || option.text);
                }
            }
        }
    // Safari 3.0.3 no longer has "item", instead it works
    // with direct array access []. Although other browsers
    // seem to support this as well, we provide checking
    // in both ways. (No idea if item is still needed.)
    } else if (typeof element.length != 'undefined' &&
        ((typeof element[0] != 'undefined' &&
        element[0].type == "radio") ||
        (typeof element.item(0) != 'undefined' &&
        element.item(0).type == "radio"))) {
        // element really contains a list of input nodes,
        // in this case.
        var radioList = element;
        value = null;
        for (var i=0; i < radioList.length; i++) {
            var radio = radioList[i] || radioList.item(i);
            if (radio.checked) {
                value = radio.value;
            }
        }
    } else if (element.type == "radio" || element.type == "checkbox") {
        if (element.checked) {
           value = element.value;
        } else {
            value = null;
        }
    } else if ((element.tagName.toLowerCase() == 'textarea')
               || (element.tagName.toLowerCase() == 'input' &&
                    element.type != 'submit' && element.type != 'reset')
              ) {
        value = element.value;
    } else {
        value = null;
    }
    return value;
};

It turned out that the element in this case was an empty list of radio buttons. When you are tab keying through a radio button group without any value selected, like in the case a content object is just created, KSS validation is triggered even though there is no value in any of the radio buttons. This makes KSS think the value is null and it does not properly handle the situation. This does not cause any user visible effects unless you have Javascript debugging on (Firebug + debugging mode in Plone’s Javascript registry).

But this was not the bug I was looking for. It was just masking the original bug, because I had an empty radio button group next to the text field whose validation was not correctly done. More server side debugging…

I inserted some funky debug prints to Archetypes.Field.validate_validators():

validate_validators()
Calling validators:(('isEmptyNoError', V_SUFFICIENT), ('validDecRange03', V_REQUIRED))

We can see that not triggered validator, validDecRange03, is still with us. Then I add more debug prints to see where things go wrong, this time to to Products.validation.chain.__call__.

Calling validators:(('isEmptyNoError', V_SUFFICIENT), ('validDecRange03', V_REQUIRED))
Name:isEmptyNoError value:234234234234234234234234234232342342342344534232342344234534554 result:True

Ok – we have a case here. isEmptyNoError validator is executed before our custom validator. Since this validator is flagged as “sufficient” other validators are not evaluated. I think this has not been the case before and our validator have worked properly… maybe there was API change in Plone 3.1 which broke the things?

After digging and digging and digging I found this 4 years old bug. Let’s open the famous isEmptyNoError source code in Products.validation.validators.EmptyValidator:

class EmptyValidator:
    __implements__ = IValidator

    def __init__(self, name, title='', description='', showError=True):
        self.name = name
        self.title = title or name
        self.description = description
        self.showError = showError

    def __call__(self, value, *args, **kwargs):
        isEmpty  = kwargs.get('isEmpty', False)
        instance = kwargs.get('instance', None)
        field    = kwargs.get('field', None)

        # XXX: This is a temporary fix. Need to be fixed right for AT 2.0
        #      content_edit / BaseObject.processForm() calls
        #      widget.process_form a second time!
        if instance and field:
            widget  = field.widget
            request = getattr(instance, 'REQUEST', None)
            if request and request.form:
                form   = request.form
                result = widget.process_form(instance, field, form,
                                             empty_marker=_marker,
                                             emptyReturnsMarker=True)
                if result is _marker or result is None:
                    isEmpty = True

        if isEmpty:
            return True
        elif value == '' or value is None:
            return True
        else:
            if getattr(self, 'showError', False):
                return ("Validation failed(%(name)s): '%(value)s' is not empty." %
                       { 'name' : self.name, 'value': value})
            else:
                return False

There is my WTF. Or XXX – thanks for the kisses. My guess is that because KSS validation is executed in special context, the magical REQUEST might not be there. The “is sufficient” validator fails because of the some sort of god forgotten magic and thus all custom validators fail in KSS when the field is not required.

The workaround: I add my own greetings to the code:

        # XXX: This is a temporary fix. Need to be fixed right for AT 2.0
        #      content_edit / BaseObject.processForm() calls
        #      widget.process_form a second time!
        if instance and field:
            widget  = field.widget
            request = getattr(instance, 'REQUEST', None)

            # XXX: Whatever this all does, it does not work for KSS validation
            # requests and we should ignore this
            if "fieldname" in request:
              return False

            if request and request.form:
                form   = request.form
                result = widget.process_form(instance, field, form,
                                             empty_marker=_marker,
                                             emptyReturnsMarker=True)
                if result is _marker or result is None:
                    isEmpty = True

If Zope 3 drives you smoking Plone 3 drives me drinking. No wonder newbies steer away from Plone – if you hadn’t been armed with 8 years of web development experience you would never have figured out what’s going on with such a simple thing as adding a custom validator. A comment added to the bug tracker.