Skip to content

Use key when sorting, not cmp

September 28, 2009

I just read Jarret Hardies article in Python Magazine on How to Survive Sorting in Python 3. There it is claimed that sort() and sorted() takes both cmp and key as parameters (which is correct) but that cmp is easier to understand and more popular.

But is it easier to understand? cmp requires you to compare two objects. It requires you to first extract a value from the two objects that can be compared, then compare these values and then return -1 if the first object is smaller than the object passed in, 0 if they are equal, and +1 if the first object passed in to the comparison function is larger than the second. Or, if it’s multiple values that should be compared, you typically compare them one by one.

OK, admittedly, that is not very hard to understand. But what about using key? Well, for key you pass in a method that takes only one parameter, the object. The method then extracts a value that is used by the sorting. Done. It honestly doesn’t get simpler or easier to understand than that. It’s also faster, as the sorting methods will call the key function on each object only once.

The reason Python programmers are still using cmp instead of key (and also list.sort() instead of sorted()) is that these didn’t exist before Python 2.4, and therefore people use the old methods out of habit. No other reason. I didn’t even know about key until a year ago when I started looking into Python 3.

There is one usecase where key has one minor issue of “uhm…”, and that is if there is no obvious way of extracting one value. Maybe you should sort fruit so you want it sorted by type of fruit, and then size. So that’s two values, but key requires you to generate just one value. Well, you can simply return a tuple of the values, in the order they should be sorted. So for the fruit case (hehe), you simply do “return fruit.type, fruit.size”. It’s that simple, again.

Jarret also mentions a use case that key doesn’t cover, namely if you want to sort one value ascending and another descending. Well, it is possible to do, by generating some sort of hash that takes this into consideration, but that’s silly. Instead Jarret recommends doing tow consecutive sorts. He also demonstrates this to be faster than a cmp based comparison, so there you go.

Key is easier to implement and faster to run. You should already now use key instead of cmp. And you should buy Python Magazine, it’s getting really good.

(Also, most of the time you should use sorted() instead of sort. Because sorted can sort anything, while list.sort() of course only works on lists. That way people can pass in generators to your methods where you expected lists, and use the dynamicism of Python even more!)


From → python, python3

  1. Lennart, Great article… this exact thing came up in conversation in the office the other day, when I saw some code by a newer python programmer who used key:

    I never knew why I’d never seen it before, but like you said if it only came in python 2.4 that could be why I never noticed it.

    And I didn’t realise the subtlety that sorted() can take any iterator not just a list. I still use list.sort() out of habit…. time to move on I think.


  2. just be careful with sorting unicode strings … i wrote how you do it here

  3. Instead of using TWO sorts, you can apply minus sign within a tuple, as in below example.

    class MC:
    def __init__(self,v,v2):


    #L2.sort(key=lambda x: x.val)
    L2.sort(key=lambda x: (x.val,-x.val2))

    for o in L2:
    print o.val,”,”,o.val2

Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: