[Biopython-dev] SVN migration and Launchpad mirroring

Bruce Southey bsouthey at gmail.com
Mon Feb 9 11:04:09 EST 2009


Chris Lasher wrote:
> On Sun, Feb 8, 2009 at 2:03 PM, Bartek Wilczynski
> <bartek at rezolwenta.eu.org> wrote:
>   
>> On Sun, Feb 8, 2009 at 5:47 PM, Giovanni Marco Dall'Olio
>> <dalloliogm at gmail.com> wrote:
>>
>>     
>>> I like github and I think its web interface is one of the best to work
>>> with git: it has some tools that I didn't see in the other hosting
>>> services supporting git (trac, gitorious), especially those for
>>> creating forks.
>>>
>>> The problem is that the basic account on github is limited to 100 MB,
>>> and with the peculiar approach adopted by git (distributed source
>>> control) anyone wishing to contribute code to biopython would have
>>> to create an account on github and, in theory, create a copy of the
>>> repository in his space.
>>>
>>> Moreover, I think it would be more difficult to use git without the
>>> tools offered by github, even if we configure a git repository with
>>> trac or similar on the openbio's servers. I don't know if the git-trac
>>> plugin has a feature to show all the forks like the one in github.
>>> Maybe I am just wrong.. but you should ask the bioruby people how
>>> comfortable they are with these issues, since they have more experience.
>>>
>>>
>>>       
>> Have you tried to use bazaar+launchpad? It's really easy and should do
>> all the tricks you need from a distributed vcs. It also has features for
>> bugtracking (like trac on github) but I don't know if we are unhappy with
>> the current setup (bugzilla). I think bzr+launchpad has a number of advantages
>> over git+github:
>> -> can work with CVS as a master repository which means that the
>>    transition would not require going through SVN (although if it would
>>    help people from OBF it is also possible).
>> -> Anyone used to cvs commands (commit, diff, update etc.) can use bzr
>>    without trouble. You only need to know new "distributed" commands
>>    (push, branch)
>> -> it supports centralized decisions on merging: the possible scenario
>>    is that only a limited number of people can merge to the main
>>    repository (push in bzr terminology)
>>     
>
> This is a good discussion. The longer BioPython has taken to move to
> SVN and the more I've worked with distributed revision control
> systems, the more inclined I am to say that moving from CVS to SVN is
> a waste of time. The advantages of DSCMs and the tools that have
> emerged around them (GitHub, Launchpad, Bitbucket, etc.) are too great
> to ignore; at some point in BioPython's path, it will move over to one
> of these tools. So why not skip to the current generation of SCM?
>   
Do you control your own project with multiple developers?
If so, how do you decide which version is the standard one, and how do
you resolve conflicts?

While I understand the advantages of the distributed option, I do not
see the end result being any different between a distributed and a
non-distributed version control system. Even in Linux, the only 'tree'
that counts is Linus's, as he provides the official versions of the
kernel. I would argue that the same applies to Biopython, especially as
there appear to be single developers contributing their own material to
the single tree rather than multiple developers working together. Part
of that is legacy, in that the core bioinformatics in Biopython is
rather complete.

> I'm most a fan of Bazaar VCS, especially given its great integration
> with Launchpad. If BioPython were to move to hosting its bugs on
> Launchpad (I believe importing from Bugzilla is possible), I think the
> benefit becomes significantly greater, due to the great ability to
> automatically associate branches/commits with bugs.
I don't find automatic association between fixes and bugs a reason to
change. In numpy's Trac system you can already see the version in which
a bug was closed.

>  If BioPython
> chooses to stick with Bugzilla, that feature wouldn't be as useful. (I
> think the same could be said for using the GitHub + Lighthouse
> combination.)
>
> On that note, I do recommend making sure that the BioPython project
> moves the code to one of these "social coding" sites (e.g., GitHub,
> Launchpad, Bitbucket). They bring the "who's working on what" that's
> necessary for tracking the project as a whole.
>
> Finally, none of this is really technically challenging, just socially
> challenging: we have to find a consensus and then actually follow
> through and make the move. It's 2009; we need to say goodbye to CVS,
> acknowledge that we missed our time with SVN, and just go straight to
> a DSCM and a modern code tracking site.
>
>   
I think the central question that is missing so far is how any of these
approaches will work with what Biopython is, how Biopython operates, and
what Biopython provides.
It is very easy to argue in general terms that one system is better
than another; there are lots of web pages on that. But that does not
address the needs of the project as a whole. At present, you and others
have not specifically addressed how Biopython would benefit from this.

How do you maintain a stable tree that should always be correct, and
how do you address conflicts (like different coding style and
semantics :-) )?
As with Linux, people do not scale, so one of the main goals of any
system should be to minimize the effort of maintaining and producing
the stable release.

How does a user get the 'latest' version if they have a bug? How do you
even know what version they actually have?
How do they avoid picking up other changes that a developer has made in
addition to that bug fix? (Not that any system is immune: see the recent
cases in numpy of developers adding unsupported dependencies or
undefined variables.)

I also favor the centralized system because I am not a Biopython
developer but a tester. Getting the current version is essential for
testing, and I do not want to have to pull in other people's code to do
that, especially if it brings in new code not related to a fix. Nor do I
think that an extended period of pre-release testing is suitable for
Biopython.

Just some thoughts,
Bruce

