
Let’s build BitTorrent support for the Cheese Shop

June 10, 2009

Edit: The reception was less than overwhelmingly enthusiastic, and for me to push a community project that is not absolutely necessary for me, I need overwhelming enthusiasm. So I’m not going to push this. I still think it could be a good solution, though.

There are two problems in the world that can be killed with one piece of software.

One of the problems is that the Python Package Index (aka the Cheese Shop) is a single point of failure when distributing Python software. Well, in fact, because some packages don’t reside there but are only indexed there, it’s multiple single points of failure. There is a whole bunch of servers that currently need to be running for you to install Plone from a buildout, for example.

A second problem is that BitTorrent, although a great idea and useful for distributing all sorts of data, is currently used pretty much only for sharing copyrighted material. This means that this great protocol often gets filtered by ISPs and the like. We need a legitimate use for BitTorrent.

And as you of course quickly realize, BitTorrent is in fact a solution to the first problem! Not only that, it also answers some other problems, like setting up a local egg cache for a company.

So, I will start a project to do this. But before I start, I want to check if more people have had this idea. Good ideas are typically somebody else’s ideas, and in this case I know that Matthew Wilkes already started on this earlier. So if you already have some half-finished code for this, or know somebody who has, give us a shout, and let’s merge the projects! Also, if you want to help, give us a shout too.


From → plone, python, zope

17 Comments
  1. Martijn Faassen permalink

    Would PyPI itself function as a seeding peer always? How does it become such a seeding peer if the actual file is not on PyPI itself?

    If PyPI is not actually seeding the off-PyPI file, what if nobody else is seeding it either?

  2. Yeah, that’s a good question. I’ve been pondering this, and I think we can probably solve it by not having PyPI be a seeder at all: keep the network separate from PyPI, but fall back to HTTP if the file doesn’t exist on the network. And then, of course, seed it afterwards.

  3. While BitTorrent is cool and all (the only protocol with EU parliamentarians…), isn’t it the wrong solution for this? After all, eggs tend to be rather small… I think leveraging a Content Delivery Network or fixing setuptools would add more value.

    For installing Plone with buildout, everything is mirrored on http://dist.plone.org/release/X.Y – when I need to install behind a firewall I just mirror this to a locally available apache.

  4. Sven Deichmann permalink

    Coolest solution would be a plugin system to include arbitrary content delivery solutions. Like BITS for Windows machines which is a very cool solution too, since it uses only spare network performance. Or even Amazon S3 or stuff like that.
    And above that we need web mirrors for python.org docs ;)

    • That is true. I was thinking plugins for different bittorrent backends, but of course that would work for anything.

  5. I would approach the problem another way: accept more than one index and have PyPI mirrors around the world. Projects like egg_proxy can also provide a solution for companies that use PyPI intensively.

  6. Yes, but all these problems have apparently already been solved by bittorrent trackers. Why solve them again?

  7. That’s a great idea. Bittorrent offers web seeds support which is a perfect fit for cheese shop.

    It would also be a small step toward moving away from a global PyPI repository, since you could easily distribute eggs without having to use any PyPI server. Tracker and here-we-go. If distutils supported the protocol, it would mean a big leap for Python.

    What’s the current problem with the Cheese Shop? That it’s a closed protocol. Not closed in a legal sense, but in terms of usage.

  8. Bittorrent for PyPI – YAGNI. A reliable mirroring infrastructure is what we need.

  9. Yeah, I’d like to see BitTorrent for PyPI, because it’s a well-tested technology requiring far less human intervention than mirroring, and it will make PyPI more reliable. And if there are many peers, it’s even faster than direct downloads.

  10. While the idea is interesting, I think a good mirroring infrastructure à la CPAN is the priority.

    We worked on PEP 381 at PyCon, and the implementation on the PyPI server itself is halfway done:

    http://www.python.org/dev/peps/pep-0381/

  11. So, if we take buildout as an example of using the Cheese Shop, how would BitTorrent know that it gets a legit egg? AFAICT BitTorrent uses some kind of index file; is that where the SHA-1 hashes are defined? In that case, wouldn’t the index file be the single point of failure?

  12. Alex: Well, the files are typically quite small, so I wouldn’t expect any speedup.

    Jörgen: You have a tracker that keeps track of clients and files. There are trackers that support multiple distributed trackers to be failsafe, although I don’t know the details.

  13. Lennart: What serves the torrent file?

    http://en.wikipedia.org/wiki/BitTorrent_(protocol)#Creating_and_publishing_torrents

    ‘Torrent files have an “announce” section, which specifies the URL of the tracker, and an “info” section, containing (suggested) names for the files, their lengths, the piece length used, and a SHA-1 hash code for each piece, all of which are used by clients to verify the integrity of the data they receive.’

    This seems to be the file that can verify your download, and should not be a single point of failure.
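As a rough illustration of the per-piece SHA-1 check that the quoted passage describes, here is a minimal Python sketch. The function names `piece_hashes` and `verify_pieces` are made up for this example, and it works on an in-memory byte string rather than parsing a real torrent file, so treat it as an outline of the idea, not as what any particular client does.

```python
import hashlib


def piece_hashes(data: bytes, piece_length: int) -> bytes:
    """Return the concatenated 20-byte SHA-1 digests of each piece of data.

    This mirrors the layout of the per-piece hashes stored in a torrent's
    "info" section: one SHA-1 digest per piece, back to back.
    """
    return b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )


def verify_pieces(data: bytes, piece_length: int, expected: bytes) -> bool:
    """Check downloaded data against the expected concatenated piece hashes."""
    return piece_hashes(data, piece_length) == expected
```

A client that trusts the hashes can detect a corrupted or tampered egg regardless of which peer supplied the bytes; the open question in this thread is where those trusted hashes come from.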

  14. I imagined having a fallback to HTTP if the file doesn’t exist in the trackers, getting it from the place that PyPI says it is, and then starting to seed it automatically once downloaded.
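The fallback described here could be sketched roughly as follows, in Python. Everything in it is hypothetical: `fetch_egg` and the three callables passed to it are placeholders invented for illustration, standing in for a real BitTorrent client and HTTP downloader, which this sketch does not implement.

```python
from typing import Callable, Optional


def fetch_egg(
    name: str,
    fetch_from_swarm: Callable[[str], Optional[bytes]],
    fetch_over_http: Callable[[str], bytes],
    start_seeding: Callable[[str, bytes], None],
) -> bytes:
    """Try the swarm first; fall back to HTTP and then seed the result.

    All three callables are hypothetical hooks supplied by the caller.
    fetch_from_swarm returns None when no peer has the file.
    """
    data = fetch_from_swarm(name)
    if data is None:
        # Nobody is sharing the file yet: download it from the location
        # the index points to, then make it available to other peers.
        data = fetch_over_http(name)
        start_seeding(name, data)
    return data
```

In this sketch seeding is only triggered on the HTTP path, on the assumption that a client which downloaded from the swarm is already sharing the file.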

  15. OK, the number of people who thought this was a bad idea was rather large. I have no interest in pushing a community project if many people feel it’s a bad idea, so I’m not going to push this further.
