[Biopython-dev] SVN migration and Launchpad mirroring

Mon Feb 9 16:59:36 UTC 2009

On Mon, Feb 9, 2009 at 5:04 PM, Bruce Southey <bsouthey at gmail.com> wrote:
> Chris Lasher wrote:
>>
>
> Do you control your own project with multiple developers?
> If so, how do you ensure which is the standard version and address
> conflicts?
>
> While I understand the advantages of the distributed option, I do not see
> the end result as any different between a distributed and a non-distributed
> version control system.

Let's say I want to develop a new module for reading FASTA sequences,
as an alternative to the current one.
With a DVCS, I would fork the official Biopython branch and start
working on it.
While I am changing things and committing everything to my private
branch, the official Biopython developers keep committing changes to
the official branch.
When I am sure that my SeqIO customization is ready, I will send you
a merge request, and it will be easy to know:
- the exact version and code of Biopython at the time I created my branch;
- which commits were made to the official branch while I was
working on mine, so it will be easier to determine how to merge them;
- moreover, if my changes are accepted, the whole history of my
private branch will be included in Biopython (which could be useful).
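
To make the three points above concrete, here is a minimal sketch in
Python of a toy commit graph. It is not how any real DVCS is
implemented, and the commit names (a1, f1, ...) and helper functions are
invented for illustration. It only shows that, once every commit
records its parent, the fork point and the commits made on each side
can be computed instead of remembered.

```python
# Toy commit graph: each commit maps to its parent (None for the root).
# All commit names are hypothetical, chosen only for this example.
parents = {
    "a1": None,
    "a2": "a1",  # tip of the official branch when the fork was made
    "a3": "a2",  # committed to the official branch after the fork
    "a4": "a3",
    "f1": "a2",  # first commit on the private SeqIO branch
    "f2": "f1",
}


def history(tip):
    """Return the commits reachable from tip, newest first."""
    commits = []
    while tip is not None:
        commits.append(tip)
        tip = parents[tip]
    return commits


def fork_point(official_tip, private_tip):
    """Return the newest commit shared by both (linear) branches."""
    official = set(history(official_tip))
    for commit in history(private_tip):
        if commit in official:
            return commit
    return None


def commits_since(tip, base):
    """Return the commits made after base on the branch ending at tip,
    oldest first."""
    commits = history(tip)
    return list(reversed(commits[:commits.index(base)]))


base = fork_point("a4", "f2")
print("forked from:", base)
print("official commits since the fork:", commits_since("a4", base))
print("private commits to merge:", commits_since("f2", base))
```

Real tools answer the same questions from the recorded history; the
point is only that nobody has to keep track of this by hand.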

Now imagine doing the same with a centralized VCS.
It would start out similarly: I would create a local copy of Biopython
on my computer and work on that (since I don't have write access to
the official repository).
When my new module is ready, I will send the changes to the official
Biopython branch through Bugzilla. The problem is that by then we will
have lost the information about which version of Biopython my local
copy was based on, so it will be more difficult to merge.

Have a look at this post:
- http://github.com/blog/39-say-hello-to-the-network-graph-visualizer

> Even in Linux, the only 'tree' that counts is Linus's as he
> provides the official versions of the kernel. I would argue that the same
> applies to Biopython especially as there appear to be single developers
> providing their own material to the single tree rather than multiple
> developers working together. Part of that is legacy in that the core
> bioinformatics in Biopython is rather complete.
>
>> I'm most a fan of Bazaar VCS, especially given its great integration
>> with Launchpad. If BioPython were to move to hosting its bugs on
>> Launchpad (I believe importing from Bugzilla is possible), I think the
>> benefit becomes significantly greater, due to the great ability to
>> automatically associate branches/commits with bugs.
>
> I don't find automatic association between fixes and bugs a reason to
> change. In numpy's Trac system you can see the version in which the bug
> was closed.
>
>>  If BioPython
>> chooses to stick with Bugzilla, that feature wouldn't be as useful. (I
>> think the same could be said for using the GitHub + Lighthouse
>> combination.)
>>
>> On that note, I do recommend making sure that the BioPython project
>> moves the code to one of these "social coding" sites (e.g., GitHub,
>> Launchpad, Bitbucket). They bring the "who's working on what" that's
>> necessary for tracking the project as a whole.
>>
>> Finally, none of this is really technically challenging, just socially
>> challenging: we have to find a consensus and then actually follow
>> through and make the move. It's 2009; we need to say goodbye to CVS,
>> acknowledge that we missed our time with SVN, and just go straight to
>> a DSCM and a modern code tracking site.
>>
>>
>
> I think the central question that is lacking so far is how will any of
> these approaches work with what Biopython is, how Biopython operates and
> what Biopython provides?
> It is very easy to argue in general terms on how one system is better than
> another - lots of web pages on that. But that does not address the needs of
> the project as a whole. At present, you and others have not specifically
> addressed how Biopython would benefit from this.
>
> How do you maintain a stable tree that always should be correct and
> addresses conflicts (like different coding style and semantics :-) )?
> As with Linux, people do not scale, so one of the main goals of any
> system is that it should minimize effort of maintaining and producing the
> stable release.
>
> How does a user get the 'latest' version if they have a bug? How do you even
> know what version they actually have?
> How do they avoid picking up other changes that a developer has made in
> addition to that bug fix? (Not that any system is immune, like developers
> adding unsupported dependencies or undefined variables as in recent cases in
> numpy).
>
> I also favor the centralized system because I am not a Biopython developer
> but a tester. So getting the current version is essential to do that and I
> do not want to have to pull other people's code in to do that especially if
> it brings in new code not related to a fix.  Nor do I think that an extended
> period of pre-release testing is suitable for Biopython.
>
> Just some thoughts,
> Bruce

-- 

My blog on bioinformatics (now in English): http://bioinfoblog.it