[Biopython-dev] SVN migration and Launchpad mirroring

Bartek Wilczynski bartek at rezolwenta.eu.org
Mon Feb 9 16:24:59 UTC 2009


On Mon, Feb 9, 2009 at 5:04 PM, Bruce Southey <bsouthey at gmail.com> wrote:

>> This is a good discussion. The longer BioPython has taken to move to
>> SVN and the more I've worked with distributed revision control
>> systems, the more inclined I am to say that moving from CVS to SVN is
>> a waste of time. The advantages of DSCMs and the tools that have
>> emerged around them (GitHub, Launchpad, Bitbucket, etc.) are too great
>> to ignore; at some point in BioPython's path, it will move over to one
>> of these tools. So why not skip to the current generation of SCM?
> Do you control your own project with multiple developers?
> If so, how do you ensure which is the standard version and address
> conflicts?
> While I understand the advantages of the distributed option, I do not see the
> end result any different between a distributed and a non-distributed version
> control system. Even in Linux, the only 'tree' that counts is Linus's as he
> provides the official versions of the kernel. I would argue that the same
> applies to Biopython especially as there appear to be single developers
> providing their own material to the single tree rather than multiple
> developers working together. Part of that is legacy in that the core
> bioinformatics in Biopython is rather complete.

That's the point. Linux is a perfect example of how a large project can
benefit from using a distributed VCS. The official branch is the one
linked from the biopython.org website, but anyone can _easily_ branch it
on his/her own, make changes to it, and either submit it for merging
into the trunk or just publish it so that people can use the branch.

>> I'm most a fan of Bazaar VCS, especially given its great integration
>> with Launchpad. If BioPython were to move to hosting its bugs on
>> Launchpad (I believe importing from Bugzilla is possible), I think the
>> benefit becomes significantly greater, due to the great ability to
>> automatically associate branches/commits with bugs.
> I don't find automatic association between fixes and bugs a reason to
> change. In numpy's Trac system you can see the version in which the bug was
> closed.

I think that using Launchpad for bug tracking is a separate issue, and
there are different options here. The good thing about Launchpad+bzr is
that it allows this, so it won't be a problem if we decide to switch
from Bugzilla to something else. But that is a separate decision.

>> Finally, none of this is really technically challenging, just socially
>> challenging: we have to find a consensus and then actually follow
>> through and make the move. It's 2009; we need to say goodbye to CVS,
>> acknowledge that we missed our time with SVN, and just go straight to
>> a DSCM and a modern code tracking site.
> I think that the central question that is lacking so far is how will any of
> these approaches work with what Biopython is, how Biopython operates and
> what Biopython provides?
> It is very easy to argue in general terms on how one system is better than
> another - lots of web pages on that. But that does not address the needs of
> the project as a whole. At present, you and others have not specifically
> addressed how Biopython would benefit from this.
> How do you maintain a stable tree that always should be correct and
> addresses conflicts (like different coding style and semantics :-) )?
> As with Linux, people do not scale, so one of the main goals of any
> system is that it should minimize the effort of maintaining and producing the
> stable release.
> How does a user get the 'latest' version if they have a bug? How do you even
> know what version they actually have?
> How do they avoid picking up other changes that a developer has made in
> addition to that bug fix? (Not that any system is immune, like developers
> adding unsupported dependencies or undefined variables as in recent cases in
> numpy).
> I also favor the centralized system because I am not a Biopython developer
> but a tester. So getting the current version is essential to do that and I
> do not want to have to pull other people's code in to do that especially if
> it brings in new code not related to a fix.  Nor do I think that an extended
> period of pre-release testing is suitable for Biopython.

Absolutely right. I think there is a misconception about the
"distributed" part of git or bzr. I don't think anybody was proposing
some guerrilla-style development with no official releases or code base.
Using a DVCS is about enabling people to contribute effectively, rather
than because it makes centralized development easier.

The key thing here is that bzr/Launchpad (or git+GitHub, but I'll stick
to what I know for the sake of this example) does not _need_ to be the
main repository for Biopython. I think the possible advantages lie not
so much in using it internally as in making it easier for people to
branch and merge. Having an "official" bzr branch of Biopython that is
automatically updated from the main VCS (currently CVS) makes branching
as easy as writing:

bzr branch lp:biopython

After someone has made a number of changes (committing them to the
local branch) and is happy with the result, he/she just runs

bzr send lp:biopython

and the maintainer of the branch gets notified about the submitted
patch. He can then decide to merge it into the trunk (without losing any
of the change history) or refuse it. Once the changes are merged into
the official bzr branch, it's easy to commit them back to CVS.

After a while, if people are happy with using bzr instead of CVS, we
could switch to bzr to avoid synchronizing with CVS, but this is not
necessary.

It's all about making it easier for people to get involved. Currently
the only way to participate is to send patches through Bugzilla or the
mailing list, and merging these into CVS is a nightmare. In bzr (or
git), by contrast, you can develop "on a branch" locally, without
disturbing anyone, and then merge with the trunk without losing your
development history (virtually impossible in CVS or SVN).
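
As a toy illustration of the history point, here is a minimal Python
sketch. It is only a conceptual model under stated assumptions, not how
bzr or git actually store revisions: the Commit class, the reachable()
helper, and all commit messages are hypothetical names made up for this
example. A merge is modelled as a commit with two parents, while applying
a mailed patch is modelled as an ordinary commit with a single parent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Hypothetical stand-in for a revision: a message plus parent links."""
    message: str
    parents: tuple = ()


def reachable(tip):
    """Return the messages of all commits reachable from `tip`."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(commit.parents)
    return {commit.message for commit in seen}


# Shared starting point, e.g. the mirror of the main repository.
base = Commit("trunk: import from CVS")

# Work done locally "on a branch", one commit per step.
fix1 = Commit("branch: fix parser", (base,))
fix2 = Commit("branch: add test", (fix1,))

# Meanwhile the trunk moves on independently.
trunk2 = Commit("trunk: unrelated change", (base,))

# DVCS-style merge: the new commit keeps BOTH lines of history as parents.
merged = Commit("merge contributor branch", (trunk2, fix2))

# Patch-style integration: the combined diff lands as one new commit
# whose only parent is the trunk, so the branch commits are not linked.
patched = Commit("apply patch from mailing list", (trunk2,))

print(sorted(reachable(merged)))
print(sorted(reachable(patched)))
```

In this model the individual branch commits remain reachable from the
merged trunk, whereas after the patch-style commit only the trunk's own
commits are, which is the sense in which the development history is lost.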

Bartek Wilczynski
Postdoctoral fellow
EMBL, Furlong group
Meyerhoffstrasse 1,
69012 Heidelberg,
tel: +49 6221 387 8433
