Please allow u'' in Python 3.0!

March 19, 2008

I’m putting together a small suite of tests of things that break in Python 3, and ways around them. Specifically, I’m trying to see whether it is possible to write code that runs under both Python 2.6 and Python 3.0. The answer so far is a resounding No! But this is because of only one thing: u'' isn’t recognized in Python 3.

A quick explanation of the problem: In Python 2.5, there are two “stringy” types, str and unicode. The first is used for 8-bit strings and 8-bit binary data, the second for unicode strings. You distinguish between the two by marking literals of the second type with a u in front, hence a unicode string is u'string'.

In Python 3, all strings are unicode. Hence, there is no need for the u'' syntax to separate unicode strings from normal strings. Instead it has a bytes type, used for 8-bit binary data. Bytes literals are marked with a b in front: b'data'.

Python 2.6 allows the b'' syntax for forward compatibility. b'data' is just the same as writing 'data' in Python 2.6. So in 2.6, 'string', b'data' and u'unicode' are all valid code. However, in Python 3, u'' fails, which means that any 2.6 code that uses unicode literals will fail under 3.0.
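A minimal sketch of the literal forms that both versions accept, leaving out u'' since that is the one that breaks. The variable names are made up for illustration:

```python
import sys

text = "string"   # str on both versions; only on 3.x is this unicode text
data = b"data"    # identical to "data" on 2.6; a separate bytes object on 3.x

# On 2.6 the b prefix changes nothing, so both literals share one type.
# On 3.x text and binary data are distinct types.
same_type = type(text) is type(data)
print("Python %d: same type? %s" % (sys.version_info[0], same_type))
```
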

Should we care? I mean, we are not supposed to write code that runs under both 2.6 and 3.0. Instead we are supposed to write code that works under 2.6 and then convert it to 3.0. Yeah, we are supposed to. This is correct. But I’ve been doing some tests, and the fact is, most straight Python code will run under both 2.6 and 3.0. The major hurdle is that the lack of u'' support in 3.0 means that you can’t use unicode.

OK, it’s possible to get around it. You can do this trick:

    try:
        # Python 2.x: the unicode type exists
        u = unicode
    except NameError:
        # Python 3.0: str is already unicode
        u = str
    text = u("This is unicode")

This runs under both 2.x and 3.0. But it’s pretty ugly. And this way it’s impossible to have anything other than ASCII in the string, as the line u("Här har vi unicode") will in 2.6 attempt to convert the byte string "Här har vi unicode" to unicode with the ASCII codec, which will fail.
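One possible variation that copes with non-ASCII text is to decode explicitly instead of calling unicode() directly. This is only a sketch: it assumes the source file is saved as UTF-8 and declares that encoding, and the helper u() is a hypothetical stand-in for the trick above.

```python
# -*- coding: utf-8 -*-
import sys

if sys.version_info[0] >= 3:
    def u(literal):
        # Python 3: a plain literal is already unicode text.
        return literal
else:
    def u(literal):
        # Python 2: a plain literal is a byte string in the file's encoding,
        # so decode it with that encoding rather than the ASCII default.
        return literal.decode("utf-8")

text = u("Här har vi unicode")
print(len(text))
```

It is just as ugly as before, since every literal still has to be wrapped in a call.
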

Result: It’s practically impossible to get an application that needs unicode to run under both 2.6 and 3.0 unless you do the above ugly trick everywhere. And this is in fact pretty much the only hurdle. Others, such as the print statement, turn out not to be a problem: print("Hey, this works like %s" % something) works fine under both. The new “as” syntax when catching exceptions is supported in 2.6, and so on.
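As a small illustration of those two points, here is a sketch that should behave the same under 2.6 and 3.0. The function describe() is made up for the example:

```python
def describe(value):
    try:
        number = int(value)
    except ValueError as error:
        # The "as" form is accepted by both 2.6 and 3.0.
        return "Not a number: %s" % error
    return "Hey, this works like %s" % number

# print with a single parenthesised argument parses as a statement on 2.x
# and as a function call on 3.0, with the same output.
print(describe("42"))
print(describe("spam"))
```
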

The things I have found so far that do need a bit of special code are imports of renamed modules (like StringIO), and special casing of iterkeys() and xrange() if you really need them, and so on. But so far, the code that I typically write would be easily adaptable to run under both 2.6 and 3.0, except for the fact that I use unicode a lot, and that won’t work.
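The special casing mentioned above might look roughly like this. The names range_ and iter_keys are hypothetical helpers, not anything from the standard library:

```python
try:
    from StringIO import StringIO   # module name on Python 2.x
except ImportError:
    from io import StringIO         # where it lives on Python 3.0

try:
    range_ = xrange                 # lazy range on Python 2.x
except NameError:
    range_ = range                  # range is already lazy on Python 3.0

def iter_keys(mapping):
    # iterkeys() is gone in 3.0; iterating a dict yields its keys on both.
    return iter(mapping)

out = StringIO()
out.write("abc")
print(out.getvalue(), sum(range_(4)), sorted(iter_keys({"a": 1, "b": 2})))
```
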

So, PLEASE, allow the u'text' syntax in Python 3.0. If you do, all my compatibility worries are gone. You can get rid of it in 3.1 if you must.


From → python, python 3000

  1. If you really want this, bring it up on That’s where you reach the developers; few of us don’t really read blogs…

  2. I will, thanks, haven’t had time to join, yet. 🙂

  3. njharman


    It would be better to have the 2.x to 3000 code munger thingy remove the ‘u’. If it doesn’t already.

  4. No it wouldn’t, it already does, and it’s beside the point.

    This is about running the same code under 2.6 and 3.0. Maintaining and releasing two different releases of the software is a pain that many people want to avoid. It also makes it practically impossible to develop an application using Python 3, if your modules are written for 2.x and you need to use trunk.

    Quite a lot of code is developed out of “necessity” and not from principles or feature lists. Supporting 2.6 and 3.0 is going to be impossible for that code if you need to run a conversion tool every time you make a change. The result is going to be that 3.0 never gets supported for this code, and that major Python users such as Plone will never use 3.0.

  5. Jim Bardin

    I’m not sure if this will still be feasible once the stdlib is fully re-factored.

    You would need lots of try/except imports to figure out where your libraries are coming from, and what they’re called. Note that they’re not just removing and renaming things. Many are being combined, and the hierarchy is changing for others.

  6. Well, I think that’s easily fixable by having a “future_extras” module that on import modifies sys.modules to also allow the new names (I haven’t tried it yet, though).

    It’s probably not going to catch 100% of it, but it doesn’t need to. It only needs to catch enough to work in 95% of the cases. For the pesky remaining 5% of modules that don’t work, well, then you have to do parallel development with 2to3.

  7. Oh, btw. An alternative and better solution was suggested on the python-3000 list: from __future__ import unicode_literals, which would mean that, for the module in question, "text" is a unicode string and not a byte string.

    I like this better, as it doesn’t change Python 3.0.

  8. Eljay

    Python 3000 is intended to break 2.x compatibility. This is exactly one of the kinds of things that will — and should — be incompatible, because the Unicode string support in 2.x is a kluge, and one of 3000’s goals is to dump the cruft and move forward afresh.

  9. As mentioned in the comments, this has been fixed in 2.6a3.
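
The unicode_literals suggestion from the comments can be sketched like this, assuming the source file is saved as UTF-8 and that the future import is the first statement in the module. Under Python 3 the import is accepted and changes nothing, since plain literals are unicode there anyway.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

# With the future import, a plain literal is unicode on 2.6 as well,
# so non-ASCII text needs no u'' prefix and no helper function.
text = "Här har vi unicode"

# Binary data is still written with the b prefix on both versions.
data = b"raw bytes"

print(type(text).__name__, type(data).__name__, len(text))
```
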

