[Biopython-dev] biopython on github

Peter Cock p.j.a.cock at googlemail.com
Mon Mar 23 10:14:10 UTC 2009


On Mon, Mar 23, 2009 at 9:02 AM, Leighton Pritchard <lpritc at scri.ac.uk> wrote:
> On 21/03/2009 14:40, "Giovanni Marco Dall'Olio" <dalloliogm at gmail.com>
> wrote:
>
>> Have a look at this video, where it shows that the Ruby On Rails
>> project has grown quicker when it has moved to github:
>>
>> - http://python.genedrift.org/2009/03/15/ror-commits/
>>
>> (the jump should be on minute 5.10 or so)
>
> I've seen this argument a couple of times, now - mostly on blogs - and I'm
> not sure that it's all that clear-cut.
>
> The RoR video shows a greater number of individual names associated with
> commits, after the move to github.  However, it's not clear whether this is
> because a large number of individuals have suddenly decided to contribute to
> the project, or whether the move to a version control system in which author
> attribution remains with contributed code - as opposed to the bottleneck of
> having to be submitted with the id of someone with write access - is
> responsible.  I don't think there's enough evidence to say 'the move to
> github caused an increase in contributions'.
>
> As a counter-example, the number of people who have recorded contributions
> to Biopython code is 46 (from the CONTRIB file on CVS).  I don't think that
> there are that many ids associated with committing the codebase on there.
> My name's only associated with GenomeDiagram in the commit comments, not as
> an author/committer of the code - at least, as far as the CVS application is
> concerned - for example.  This might change with git.  Of course, I might be
> misunderstanding git's attribution model, or how the stats for that RoR
> video were compiled...

Leighton has a good point about the attribution, and the dangers of
over-interpreting such a video.  Git/github will make it easier to
see who contributed patches (if a patch is pulled into another branch,
both the person doing the merge and the person who originally checked
in the patch get recorded), and that may indirectly encourage more
contributions.  As Leighton points out, we do try to give credit now
in CVS commit comments, but the changes themselves are checked in by a
core developer.  I imagine the same happened with RoR before the move,
but compiling this information from commit comments for that video
would probably have been too much work.  So as well as switching
tools, you are also changing the metric.
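
To make the attribution point concrete, here is a minimal sketch of
counting distinct authors versus distinct committers.  It is not how the
RoR video was compiled (that is unknown here).  It assumes one line per
commit in the form "author|committer", which is what
git log --pretty=format:'%an|%cn' prints; the function name and the
sample names are made up for illustration.

```python
def count_contributors(log_lines):
    """Return (distinct authors, distinct committers) for 'author|committer' lines."""
    authors = set()
    committers = set()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: names do not themselves contain a '|' character.
        author, committer = line.split("|", 1)
        authors.add(author)
        committers.add(committer)
    return len(authors), len(committers)


# Hypothetical history: three outside patches plus one change by the core
# developer, all checked in by that same core developer.
sample_log = [
    "alice|core_dev",
    "bob|core_dev",
    "core_dev|core_dev",
    "carol|core_dev",
]

n_authors, n_committers = count_contributors(sample_log)
print(n_authors, n_committers)
```

Counting by committer gives one name for this sample, while counting by
author gives four, even though the underlying contributions are identical.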

Something else to consider is how you are measuring activity: the git
and github documentation and press encourage people to commit more
often.  For example, while working on a bug fix or a new feature, I
might make three incremental commits on my local copy of the
repository before I am happy enough to push them to the online
repository.  This would then show as three commits, wouldn't it?  On
CVS it would probably be just one, i.e. on CVS I suspect you naturally
tend to get a smaller number of larger commits than with git.  (This
difference will probably vary from person to person - I haven't
counted or anything, but with CVS I think I tend to commit lots of
smaller changes, while Michiel for example tends to make fewer but
larger commits.)  So if the RoR video shows a sudden jump in the
number of commits, that doesn't necessarily mean more code changes.
Scaling by number of lines changed would be another metric which is
perhaps more robust - but has drawbacks of its own.
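
A small sketch of the two metrics side by side, under the assumption that
each commit is summarised as a (lines added, lines deleted) pair (the sort
of per-commit totals that could be derived from git log --numstat).  The
function name and all the numbers are invented for illustration.

```python
def summarise(commits):
    """Return (number of commits, total lines changed).

    commits is a list of (lines_added, lines_deleted) tuples, one per commit.
    """
    n_commits = len(commits)
    lines_changed = sum(added + deleted for added, deleted in commits)
    return n_commits, lines_changed


# Hypothetical: the same bug fix recorded as one large commit (CVS habit)
# or as three incremental commits (git habit).
cvs_style = [(80, 10)]
git_style = [(30, 5), (30, 5), (20, 0)]

print(summarise(cvs_style))
print(summarise(git_style))
```

Here the commit count triples while the lines changed stay the same, so a
jump in commit count alone does not show that more code was changed.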

Peter



