[Biopython-dev] SVN migration and Launchpad mirroring
Bruce Southey
bsouthey at gmail.com
Mon Feb 9 11:04:09 EST 2009
Chris Lasher wrote:
> On Sun, Feb 8, 2009 at 2:03 PM, Bartek Wilczynski
> <bartek at rezolwenta.eu.org> wrote:
>
>> On Sun, Feb 8, 2009 at 5:47 PM, Giovanni Marco Dall'Olio
>> <dalloliogm at gmail.com> wrote:
>>
>>
>>> I like github and I think its web interface is one of the best to work
>>> with git: it has some tools that I didn't see in the other hosting
>>> services supporting git (trac, gitorious), especially those for
>>> creating forks.
>>>
>>> The problem is that the basic account on github is limited to 100 MB,
>>> and with the peculiar approach adopted by git (distributed source
>>> control) anyone wishing to contribute code to biopython would have
>>> to create an account on github and, in theory, create a copy of the
>>> repository in his space.
>>>
>>> Moreover, I think it would be more difficult to use git without the
>>> tools offered by github, even if we configure a git repository with
>>> trac or similar on the openbio's servers. I don't know if the git-trac
>>> plugin has a feature to show all the forks like the one in github.
>>> Maybe I am just wrong.. but you should ask the bioruby people how
>>> comfortable they are with these issues, since they have more experience.
>>>
>>>
>>>
>> Have you tried to use bazaar+launchpad? It's really easy and should do
>> all the tricks you need from a distributed vcs. It also has features for
>> bugtracking (like trac on github) but I don't know if we are unhappy with
>> the current setup (bugzilla). I think bzr+launchpad has a number of
>> advantages over git+github:
>> -> it can work with CVS as a master repository, which means that the
>>    transition would not require going through SVN (although if it would
>>    help people from OBF it is also possible).
>> -> anyone used to cvs commands (commit, diff, update etc.) can use bzr
>>    without trouble. You only need to know the new "distributed" commands
>>    (push, branch).
>> -> it supports centralized decisions on merging: the possible scenario
>>    is that only a limited number of people can merge to the main
>>    repository (push in bzr terminology).
>>
>
> This is a good discussion. The longer BioPython has taken to move to
> SVN and the more I've worked with distributed revision control
> systems, the more inclined I am to say that moving from CVS to SVN is
> a waste of time. The advantages of DSCMs and the tools that have
> emerged around them (GitHub, Launchpad, Bitbucket, etc.) are too great
> to ignore; at some point in BioPython's path, it will move over to one
> of these tools. So why not skip to the current generation of SCM?
>
Do you control your own project with multiple developers?
If so, how do you establish which is the standard version and resolve
conflicts?
While I understand the advantages of the distributed option, I do not see
the end result being any different between a distributed and a
non-distributed version control system. Even in Linux, the only 'tree'
that counts is Linus's, as he provides the official versions of the
kernel. I would argue that the same applies to Biopython, especially as
there appear to be single developers providing their own material to the
single tree rather than multiple developers working together. Part of
that is legacy, in that the core bioinformatics in Biopython is rather
complete.
> I'm most a fan of Bazaar VCS, especially given its great integration
> with Launchpad. If BioPython were to move to hosting its bugs on
> Launchpad (I believe importing from Bugzilla is possible), I think the
> benefit becomes significantly greater, due to the great ability to
> automatically associate branches/commits with bugs.
I don't find automatic association between fixes and bugs a reason to
change. In numpy's Trac system you can see the version in which the bug
was closed.
> If BioPython
> chooses to stick with Bugzilla, that feature wouldn't be as useful. (I
> think the same could be said for using the GitHub + Lighthouse
> combination.)
>
> On that note, I do recommend making sure that the BioPython project
> moves the code to one of these "social coding" sites (e.g., GitHub,
> Launchpad, Bitbucket). They bring the "who's working on what" that's
> necessary for tracking the project as a whole.
>
> Finally, none of this is really technically challenging, just socially
> challenging: we have to find a consensus and then actually follow
> through and make the move. It's 2009; we need to say goodbye to CVS,
> acknowledge that we missed our time with SVN, and just go straight to
> a DSCM and a modern code tracking site.
>
>
I think the central question that is lacking so far is: how will any of
these approaches work with what Biopython is, how Biopython operates and
what Biopython provides?
It is very easy to argue in general terms about how one system is better
than another - there are lots of web pages on that. But that does not
address the needs of the project as a whole. At present, you and others
have not specifically addressed how Biopython would benefit from this.
How do you maintain a stable tree that should always be correct and
address conflicts (like different coding styles and semantics :-) )?
As with Linux, people do not scale, so one of the main goals of any
system should be to minimize the effort of maintaining and producing
the stable release.
How does a user get the 'latest' version if they have a bug? How do you
even know what version they actually have?
How do they avoid picking up other changes that a developer has made in
addition to that bug fix? (Not that any system is immune, like
developers adding unsupported dependencies or undefined variables as in
recent cases in numpy.)
I also favor the centralized system because I am not a Biopython
developer but a tester. Getting the current version is essential for
testing, and I do not want to have to pull in other people's code to do
so, especially if it brings in new code not related to a fix. Nor do I
think that an extended period of pre-release testing is suitable for
Biopython.
Just some thoughts,
Bruce