Please allow u'' in Python 3.0!
I’m putting together a small suite of tests of things that break in Python 3, and ways around them. Specifically, I’m trying to see if it is possible to write code that runs under both Python 2.6 and Python 3.0. The answer so far is a resounding No! But this is because of only one thing: u'' isn’t recognized in Python 3.
A quick explanation of the problem: In Python 2, there are two “stringy” types, str and unicode. The first is used for 8-bit strings and 8-bit binary data, the second for unicode strings. You distinguish between the two by marking the second type with a u in front, hence a unicode string is u'string'.
In Python 3, all strings are unicode. Hence, there is no need for the u'' syntax to separate unicode strings from normal strings. Instead, Python 3 has a bytes type, used for 8-bit binary data. Bytes literals are marked with a b in front: b'data'.
Python 2.6 allows the b'' syntax for forward compatibility. b'data' is just the same as writing 'data' in Python 2.6. So in 2.6, 'string', b'data' and u'unicode' are all valid code. However, in Python 3, u'' fails, which means that any 2.6 code that uses unicode literals will fail under 3.0.
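To make the difference concrete, here is a minimal sketch of the two literals that both versions accept. The variable names are only for illustration; the comments describe what each interpreter should report.

```python
text = 'string'   # str: an 8-bit string on 2.x, a unicode string on 3.x
data = b'data'    # bytes literal: accepted by both 2.6 and 3.x

# On 2.6 the b prefix changes nothing, so both values have the same type.
# On 3.x they are two distinct types (str and bytes).
same_type = type(text) is type(data)
print(same_type)
```

Adding a line such as unicode_text = u'unicode' to this file is what makes it stop compiling under 3.0.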
Should we care? I mean, we are not supposed to write code that runs under both 2.6 and 3.0. Instead we are supposed to write code that works under 2.6 and then convert it to 3.0. Yeah, we are supposed to. This is correct. But I’ve been doing some tests, and the fact is, most straight Python code will run under both 2.6 and 3.0. The major hurdle is that the lack of u'' support in 3.0 means that you can’t use unicode literals.
OK, it’s possible to get around this. You can do this trick:
try:
    u = unicode      # Python 2.x
except NameError:
    u = str          # Python 3.x: str is already unicode

text = u("This is unicode")
This runs under both 2.x and 3.0. But it’s pretty ugly. And this way it’s impossible to have anything other than ASCII in the string, as the line u("Här har vi unicode") will in 2.6 attempt to convert the byte string "Här har vi unicode" to unicode with the ASCII encoding, which will fail.
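The failure can be reproduced in isolation. The sketch below assumes the source file was saved as UTF-8, so the bytes are spelled out with escapes to keep the snippet itself pure ASCII; decoding them as ASCII is roughly what unicode() does implicitly with a non-ASCII 2.x str.

```python
# UTF-8 bytes of "Här har vi unicode" (assumed source encoding).
raw = b"H\xc3\xa4r har vi unicode"

try:
    raw.decode("ascii")      # the implicit conversion 2.6 attempts
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False         # 0xc3 is outside the ASCII range

# Decoding with the real encoding works, but has to be spelled out by hand.
decoded = raw.decode("utf-8")
print(ascii_ok)
```

Spelling out the encoding at every string is of course even uglier than the u() trick itself.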
Result: It’s practically impossible to get an application that needs unicode to run under both 2.6 and 3.0 unless you do the above ugly trick everywhere. And this is in fact pretty much the only hurdle. Others, such as the print statement, turn out not to be a problem. print("Hey, this works like %s" % something) works fine under both. The new “as” syntax when catching exceptions is supported in 2.6, and so on.
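As a small illustration of those two points, the following sketch should run unchanged under both 2.6 and 3.0. The value of something and the failing int() call are made up for the example.

```python
something = "this"

# With a single argument, the parentheses are just grouping on 2.x
# and a function call on 3.x; the output is the same.
print("Hey, this works like %s" % something)

try:
    int("not a number")
except ValueError as err:    # the "as" form is valid in 2.6 and 3.x
    message = str(err)

print(message)
```

Note that print with several comma-separated arguments is a different story: on 2.x that prints a tuple unless you import print_function from __future__.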
The things I have found so far that do need a bit of special code are imports of renamed modules (like StringIO), special casing of iterkeys() and xrange() if you really need them, and so on. But so far, the code that I typically write would be easily adaptable to run under both 2.6 and 3.0, except for the fact that I use unicode a lot, and that won’t work.
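The special casing can look something like this sketch. The names lazy_range and iter_keys are hypothetical helpers invented for the example, not part of any library.

```python
try:
    from StringIO import StringIO    # Python 2.x module name
except ImportError:
    from io import StringIO          # Python 3.x location

try:
    lazy_range = xrange              # Python 2.x lazy range
except NameError:
    lazy_range = range               # Python 3.x range is already lazy


def iter_keys(d):
    """Iterate over the keys of d without building a list on 2.x."""
    try:
        return d.iterkeys()          # Python 2.x
    except AttributeError:
        return iter(d.keys())        # Python 3.x

buf = StringIO()
buf.write("hello")
print(buf.getvalue())
print(list(lazy_range(3)))
print(sorted(iter_keys({"a": 1, "b": 2})))
```

It is a few lines of boilerplate at the top of a module, but unlike the unicode problem it only has to be written once.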
So, PLEASE, allow the u'text' syntax in Python 3.0. If you do, all my compatibility worries are gone. You can get rid of it in 3.1 if you must.