Python and time zones part 2: The beast returns!
Updated: I’ve added step 3 ½ and 3 ¾.
Updated: More on MS Windows.
In my previous post on Python and time zones, I explained five problems with time zones in general and Python support for them in particular, how I succeeded in solving four of them, and how I ignored the fifth. But I also mentioned a sixth potential problem: Windows. Or, more generally speaking, cross-platform support.
I’ve now tackled this head-on, in another two-day marathon of fiddling and testing. The result is a small program (a mere 365 lines of code at the moment) that will tell you what time zone your computer is in. Yeah. Really. Amazing, isn’t it? The program needs more testing, and testing only takes a minute or two, so please help me test it.
Why is it so big, you ask? Getting a timezone in Python is easy, right? from time import tzname, right? Wrong!
Unix and Mac OS X
Under Unix and OS X, tzname will not tell you what you need, because it gives you an abbreviation, like “EST”, which unfortunately can be any of three different time zones. And many time zones have the same abbreviation even though they have different daylight saving times. What you really want is the zoneinfo database name, like Europe/Paris or US/Eastern. Not CET or EST. So, how can you figure it out? Well, it’s a complex multi-step story.
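For reference, this is all that the standard library hands you out of the box. It is a minimal illustration, and the values printed depend entirely on the machine it runs on.

```python
import time

# time.tzname is a 2-tuple: (standard-time name, daylight-saving name).
standard_name, daylight_name = time.tzname
print("standard:", standard_name)
print("daylight:", daylight_name)

# What we are actually after is a zoneinfo database name such as
# "Europe/Paris", and these names cannot be mapped back to one
# unambiguously.
```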
Step 1. Some machines have a TZ environment variable set up. It can be rather complex, and you can define much of the time zone info, like daylight saving and such, in that variable. If you do, there is no way we can figure out where you are, so we can stop there. But more commonly, the TZ variable contains the name of a file that specifies the time zone. This specification can be either an absolute path to the file, or a name relative to the root location of the zoneinfo files. So it should be either Asia/Dubai or /usr/share/zoneinfo/Asia/Dubai. If it’s a relative path, we can just verify that it is a valid time zone name, and then we are done. If it’s an absolute path, we need to pick away one part at a time until we get a valid time zone name.
Step 2. If the TZ variable is not set, then the default time zone file is /etc/localtime. OS X and some unices, like Ubuntu 7.10, set up /etc/localtime as a symlink pointing to the selected time zone file. So all we need to do is follow /etc/localtime to the real filename. It will be something like /usr/share/zoneinfo/Australia/Canberra, and then we chop off one directory name at a time until we have a valid time zone name.
However, not all unices symlink like this. Ubuntu 8.04 doesn’t, and CentOS doesn’t. Instead they copy the zoneinfo file to /etc/localtime when you change the time zone, so we can’t follow /etc/localtime to the real file.
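A minimal sketch of the TZ check might look like the following. It is not the code from the actual program: the function name and the `valid_names` parameter are made up for illustration, and the set of valid names is assumed to come from somewhere else (pytz, or `zoneinfo.available_timezones()` on newer Pythons).

```python
import os


def zone_name_from_tz(tz_value, valid_names):
    """Return a zoneinfo name for a TZ value, or None if there is none.

    valid_names is an assumed set of known zone names, for example
    {"Asia/Dubai", "Europe/Paris"}.
    """
    if not tz_value:
        return None
    parts = tz_value.strip("/").split("/")
    # Drop leading path components one at a time until what is left is a
    # known zone name: usr/share/zoneinfo/Asia/Dubai -> Asia/Dubai.
    # A relative name such as Asia/Dubai matches on the first try.
    for start in range(len(parts)):
        candidate = "/".join(parts[start:])
        if candidate in valid_names:
            return candidate
    # Probably a hand-written rule; nothing to look up.
    return None


KNOWN = {"Asia/Dubai", "Europe/Paris"}  # tiny illustrative set
print(zone_name_from_tz(os.environ.get("TZ"), KNOWN))
```

A None result here covers both "TZ is not set" and "TZ holds a full rule instead of a name". A real program would want to tell those two apart, since only the first should fall through to the next step.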
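The symlink case could be sketched roughly like this. Again, the function name and the `valid_names` set are assumptions for the sake of the example, not taken from the real program.

```python
import os


def zone_name_from_symlink(localtime_path, valid_names):
    """Follow a symlinked localtime file and derive the zone name.

    Returns None when the file is not a symlink or when no trailing part
    of the target path is a known zone name.
    """
    if not os.path.islink(localtime_path):
        return None  # a plain copy tells us nothing about the name
    target = os.path.realpath(localtime_path)
    parts = target.strip(os.sep).split(os.sep)
    # Chop off one leading directory at a time:
    # usr/share/zoneinfo/Australia/Canberra -> Australia/Canberra.
    for start in range(len(parts)):
        candidate = "/".join(parts[start:])
        if candidate in valid_names:
            return candidate
    return None


print(zone_name_from_symlink("/etc/localtime", {"Australia/Canberra"}))
```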
Step 3. But some unices are nice enough to store the time zone in /etc/timezone, so then we can look it up there. This is true for Ubuntu and Solaris, for example. In this case it is a relative specification, i.e. it’s stored as Africa/Cairo, so we can return that.
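Reading that file is about as simple as it gets. A sketch, assuming the file contains nothing but the zone name and possibly a trailing newline (the function name is hypothetical):

```python
def zone_name_from_timezone_file(path="/etc/timezone"):
    """Return the zone name stored in the given file, or None."""
    try:
        with open(path, encoding="ascii", errors="replace") as f:
            name = f.read().strip()
    except OSError:
        # No such file (or not readable): this system does not do Step 3.
        return None
    return name or None


print(zone_name_from_timezone_file())
```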
Step 3 ½. Gentoo has a file /etc/conf.d/clock which has the entry TIMEZONE with the timezone, so we can use that under Gentoo.
Step 3 ¾. CentOS has a file /etc/sysconfig/clock which has the entry ZONE with the timezone which we can use. OpenSUSE has the same file, but calls the entry TIMEZONE.
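Steps 3 ½ and 3 ¾ are the same trick with different file names and keys, so one small parser can cover both. This sketch assumes the entries look like shell assignments (`ZONE="Europe/Stockholm"`), with or without quotes. The function names are made up for the example.

```python
import re

# Matches lines such as ZONE="Europe/Stockholm" or TIMEZONE=Europe/Berlin.
# Commented-out lines start with "#" and therefore do not match.
_ENTRY = re.compile(r'^\s*(TIMEZONE|ZONE)\s*=\s*"?([^"\s#]+)"?')


def zone_name_from_clock_config(text):
    """Return the value of the first TIMEZONE or ZONE entry, or None."""
    for line in text.splitlines():
        match = _ENTRY.match(line)
        if match:
            return match.group(2)
    return None


def zone_name_from_clock_file(path):
    """Parse a clock config file, returning None if it cannot be read."""
    try:
        with open(path, errors="replace") as f:
            return zone_name_from_clock_config(f.read())
    except OSError:
        return None


for candidate in ("/etc/conf.d/clock", "/etc/sysconfig/clock"):
    print(candidate, zone_name_from_clock_file(candidate))
```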
Step 4. Some unices, most notably CentOS, will not symlink /etc/localtime, and will not have a /etc/timezone. So all efforts so far to figure out the time zone will fail. But there is one ugly last hack we can do. We can simply compare /etc/localtime with all the files in /usr/share/zoneinfo until we find the correct one! :-) Yes, ugly I know, but it works! We need to skip the files in the SystemV directory and the file called localtime, which is a symlink to /etc/localtime, but then it seems to work fine.
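The ugly hack can be sketched like this. It is an illustration under the assumptions stated in the step (skip SystemV, skip anything called localtime), not the code from the actual program, and the function name is invented.

```python
import os

SKIP_DIRS = {"SystemV"}
SKIP_FILES = {"localtime"}


def zone_name_by_comparison(localtime_path, zoneinfo_root):
    """Find a zoneinfo file with exactly the same bytes as localtime_path."""
    with open(localtime_path, "rb") as f:
        wanted = f.read()
    for dirpath, dirnames, filenames in os.walk(zoneinfo_root):
        # Prune skipped directories in place and walk in a stable order.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for filename in sorted(filenames):
            if filename in SKIP_FILES:
                continue
            full = os.path.join(dirpath, filename)
            try:
                with open(full, "rb") as f:
                    data = f.read()
            except OSError:
                continue  # unreadable file or dangling symlink
            if data == wanted:
                relative = os.path.relpath(full, zoneinfo_root)
                return relative.replace(os.sep, "/")
    return None
```

Several names in the database can share identical data, so this returns the first match in sorted order, which is not necessarily the name the user picked.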
But of course, not all unices have a /usr/share/zoneinfo. Admittedly, most do. Also, you can in theory have a modified version of the zoneinfo file, or you can have a TZ variable that is really strange, or the computer could simply have its configuration messed up. We need a backup plan for those cases where all else fails:
Step 5. Revert back to using time.tzname. I know, it’s not reliable, but it works in most cases. And what if it doesn’t? Well, then you just have to bloody well configure your computer correctly!
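Put together, the whole Unix side is just a chain of attempts with time.tzname at the end. A sketch of that control flow, where each strategy is assumed to be a callable taking no arguments and returning a zone name or None (the name `detect_zone` is invented for this example):

```python
import time


def detect_zone(strategies):
    """Try each strategy in order; fall back to time.tzname if all fail."""
    for strategy in strategies:
        try:
            name = strategy()
        except OSError:
            # Missing or unreadable file: move on to the next step.
            name = None
        if name:
            return name
    # Last resort: the unreliable abbreviation for standard time.
    return time.tzname[0]


print(detect_zone([lambda: None]))
```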
Windows
Under Windows, everything is completely different. Well, there is one thing in common: time.tzname is useless. This is of course mostly Microsoft’s fault. This is how it works on Windows.
Step 1. The time zone info of your computer is stored in the registry. Now, I like the registry, I think it is a perfectly good idea to store all configurations in one place. Really. Just remember to back it up. It’s also easy to edit, and you can search in it. That’s so much better than having 200 configuration files spread out over the hard disk. However, the API to the registry was designed by somebody with only half a brain. First you open a connection to the registry. All is well. Then you open the key you want. Fine. Then you want to know what subkeys there are. Well, you can’t. No, you have to ask the key how many subkeys it has, and then loop through them, asking for the id of each subkey. The same thing goes for values. It’s really stupid. So stupid in fact that the standard Python module is called _winreg, because it only wraps this API, and it is so stupid that the Python people refuse to call the module winreg, reserving that name for a module that has a usable API. It doesn’t exist yet, though. Yes, this means that you can’t just get the value “something”. You have to loop through all values until you find the one called “something”. Amazing.
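The count-then-loop dance looks roughly like this. The helpers take the registry module as a parameter (`_winreg` on Python 2, renamed `winreg` in Python 3). The helper names are made up, and the registry path at the bottom is my assumption about where Windows keeps its list of time zones, not something taken from the program described here.

```python
import sys


def list_subkeys(reg, key):
    """Names of all subkeys of an open registry key."""
    subkey_count, _value_count, _modified = reg.QueryInfoKey(key)
    return [reg.EnumKey(key, index) for index in range(subkey_count)]


def read_values(reg, key):
    """All values of an open registry key as a {name: data} dict."""
    _subkey_count, value_count, _modified = reg.QueryInfoKey(key)
    values = {}
    for index in range(value_count):
        name, data, _type = reg.EnumValue(key, index)
        values[name] = data
    return values


if sys.platform == "win32":
    import winreg

    # Assumed location of the time zone list.
    path = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        print(list_subkeys(winreg, key)[:5])
```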
However, the person who designed this either was not alone in the stupidity, or he was one of the top designers at Microsoft, because the time zone support is just as moronic. The registry keeps track of loads of information about your selected time zone, but for some reason not the name of the time zone! How stupid can people get? Even the complete morons who designed the file format of zoneinfo files understood that you need the name of the time zone. They just didn’t realize the names should be unique. Instead Microsoft stores the name of the time zone as displayed in the dialog box where you change time zones. Yes, really, as displayed. They store not the name of the time zone, but the title. This is amazingly stupid in itself. But it doesn’t end there. Yes, you guessed it, that value will be translated into the language of the installed Windows. So, when you look up in the registry what time zone you are in, you will get different results for the same time zone if you run an English Windows and if you run a French Windows. Fantastic, isn’t it? The level of stupidity involved here is truly something extraordinary.
Step 2. Of course, Python’s time.tzname will return these display values. This makes time.tzname completely useless. So to figure out which time zone is selected, you have to loop through all the time zones that exist (in another part of the registry) and compare the display values with the ones used. Yes, really, you have to loop through. Because, as mentioned, the display values are internationalized, so you can’t use a fixed lookup table. But anyway, now you have found the time zone. All is well? Of course not.
Step 3. Whoever designed the Windows time zone implementation was not only a moron, he was also on drugs. Because the time zone names do not have any standard or even reasonable naming scheme. Forget the wonderful “Europe/Stockholm”. Of course they don’t use “EST” or “Eastern Standard Time”, that wouldn’t work, the names must be unique as they are keys in the registry. No, instead they have their own scheme. “US Eastern Standard Time” is at least reasonably intelligent, but “Arabic Standard Time”? Are Arabs only allowed one? Why then does the United Arab Emirates have GMT+4, while Saudi Arabia has GMT+3? “E. Europe Standard Time”? Didn’t “Eastern” fit? “E. South America Standard Time”. Eastern South America!? Yeah, that’s a well recognized geographical and political entity… It must be, because there is a “SA Eastern Standard Time”, i.e. South America Eastern Standard Time, as well. The difference? Well, who knows. And of course, the favourite: “Romance Standard Time”. Named after what? Rome? The Romance in France? The Scandinavian romanticist movement? No, the people that came up with this must have been on drugs, and pretty heavy ones too.
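Once the registry contents are in a dictionary, the matching itself is a plain loop. In this sketch the value name "Std" is my assumption about where the standard-time display string lives, and the sample strings are illustrative stand-ins for a translated Windows, not copied from a real registry.

```python
def find_zone_key(current_standard_name, zones):
    """Return the registry key name whose display value matches.

    zones maps each time zone key name to a {value name: data} dict,
    as read from the list of all time zones in the registry.
    """
    for key_name, values in zones.items():
        if values.get("Std") == current_standard_name:
            return key_name
    return None


# Illustrative data only: made-up display strings on a non-English system.
zones = {
    "Romance Standard Time": {"Std": "Paris, Madrid"},
    "W. Europe Standard Time": {"Std": "Europe de l'Ouest"},
}
print(find_zone_key("Paris, Madrid", zones))
```

Because the comparison is made against whatever the local registry contains, it works regardless of which language the display strings happen to be in.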
Anyway, the solution to this was to write a script that loads a file with conversion data from unicode.com, which keeps an up-to-date version (I hope), and converts that to a mapping from Microsoft’s names to zoneinfo database names. Of course, these names are different in Windows 95 and Windows XP, so Windows 95 will not work. I make some wild guesses on Windows 95, but generally I just ignore it. I also have no idea how translation of time zones works on Windows 95, as I only have the English version. It looks like (but I’m not sure) they don’t translate the values in Windows 95. It would be fun to confirm that, but Windows 95 support is a rather low priority…
So there. Now we know what time zone we are in on Windows (at least XP), OS X and most unices. That was easy, wasn’t it? Yeah, maybe not.
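A conversion script along those lines could be sketched as follows. The XML layout here is an assumption modelled on the Unicode CLDR windowsZones data as I remember it (mapZone elements with other, territory and type attributes), so check it against the real file before relying on it. The inline sample is a tiny excerpt for illustration and the function name is made up.

```python
import xml.etree.ElementTree as ET

SAMPLE = """\
<supplementalData>
  <windowsZones>
    <mapTimezones>
      <mapZone other="Romance Standard Time" territory="001" type="Europe/Paris"/>
      <mapZone other="Romance Standard Time" territory="BE" type="Europe/Brussels"/>
      <mapZone other="W. Europe Standard Time" territory="001" type="Europe/Berlin"/>
    </mapTimezones>
  </windowsZones>
</supplementalData>
"""


def windows_to_zoneinfo(xml_text):
    """Build a {Windows name: zoneinfo name} dict from the XML text."""
    mapping = {}
    for element in ET.fromstring(xml_text).iter("mapZone"):
        # Assumption: territory "001" marks the default entry for a zone.
        if element.get("territory", "001") != "001":
            continue
        # "type" may list several names separated by spaces; keep the first.
        mapping[element.get("other")] = element.get("type").split()[0]
    return mapping


print(windows_to_zoneinfo(SAMPLE))
```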