[Bioperl-l] Google Summer of Code - BioPerl proposals

Fields, Christopher J cjfields at illinois.edu
Mon Apr 1 22:23:45 UTC 2013


On Apr 1, 2013, at 12:17 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> On 1 April 2013 04:28, Fields, Christopher J <cjfields at illinois.edu> wrote:
>> On Mar 31, 2013, at 9:05 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:
>> 
>>> On 1 April 2013 01:34, Fields, Christopher J <cjfields at illinois.edu> wrote:
>>>> I agree.  Another approach might be to cleave off a section that you could mould into your own; this could be done for bioperl-run, bioperl-live, etc.
>>> 
>>> Why did the project run out of time 2 years ago? The blog posts about
>>> it are very few and don't sound too bad. It mentions having prepared a
>>> couple of them, but none was actually ever released. Instead, the
>>> source was also kept in bioperl-live and seems to have already
>>> branched. Is there any reason for this? It was my understanding that
>>> splitting the project is still desirable, from a discussion back in
>>> February
>>> 
>>> http://article.gmane.org/gmane.comp.lang.perl.bio.general/26395
>>> 
>>> it just happens that no one has picked it up yet.
>> 
>> The project actually made a lot of headway; the particular pieces moved out (Bio::Root, Bio::Factory, etc) worked fine, but we never followed up on exactly what to do next on master branch.  It's perfectly feasible for someone to go ahead and finish the initial part of that (in fact, I believe there were some branches that started along this path but never merged back in).
> 
> Can I merge any branching between these and bioperl-live and set them
> up so you only have to run dzil on their repos?

I wouldn't worry about the branches; they are probably too stale.  Have it so dzil works for the various repos from that project (it should already).  We will likely need to think about having a stub Build.PL that can be used for basic installation, but would be auto-generated based on the needs of that repo (and so shouldn't be committed).  This is mainly to help git-savvy users, not devs; we don't necessarily want users to install dzil, which had somewhere north of 40 or so dependencies IIRC.

>>> I think splitting bioperl-live into subdistributions and making a new
>>> 1.70 release of each of them is perfectly doable over a summer. And I
>>> say this after having split and released Bio-Biblio. This is one of my
>>> itches with BioPerl. I have been using it for almost 3 years, but have
>>> never seen a release. I would like to make new releases of everything,
>>> no changes at the start, but take them to the point that "dzil
>>> release" does everything. Make it really easy for anyone to come in
>>> and contribute and even easier for a maintainer to make a new release
>>> after receiving a contribution. Is this desirable for the project?
>>> 
>> 
>> Hilmar's point is pretty valid, namely that a case would have to be made as to why the initial run at it wasn't completed, or why it would work better this time.  We're not suggesting that this can't be done, but the above point would have to be answered.
> 
> The only reason why I claim to be able to finish this is that I'm very
> familiar with both BioPerl and the tools to make the split. Plus,
> I already split one (and am trying to split another) to get a clear idea
> of what it involves.

Right, I do think it's feasible.  But see Hilmar's response on this point; you don't have to convince us.

>> Frankly, the project has been pretty reliant on me for releases, so it's perfectly valid to point out the modules haven't made it out yet b/c I haven't made a release since then.  From that point of view, this would be a continuation of that work, maybe with the intent/focus on making code releases much easier.
> 
> As a maintainer of another gigantic FOSS project that is also a
> collection of libraries, I can relate to this. Of course it can be
> much more interesting to write new sexy code and add it to the huge
> pile of modules already in bioperl-live but I want to make it easier
> for others to develop on BioPerl. Comparing with chemistry, I want
> this to be the equivalent of a catalyst for the development, rather
> than another reactant.
> 
>> Regarding updating Bioperl to use Dist::Zilla amongst other modern perl tools (Moose included), yes, it is very much our wish/intent to have this, in any way possible.  But I don't think we can call it BioPerl v1.7, simply based on past release cycles; we're somewhat bound by deprecations, etc.  We really need a clean break.
>> 
>> So, my general feeling is that while we are cleaving out code and releasing the independent dists and core, we should re-christen core as 1.9 (e.g. pre-v2).  We move to v2 when we feel we're at the right point.  Each of the individual distributions would have to start with its own version; anything greater than the point where it left the core/live distribution should work.  I agree with you in that I don't think it would take a long time, but we also have bioperl-run in the mix (and in many cases it would make sense to combine wrappers with the proper parsers), so simply cleaving out from one repo may not be the best approach.
>> 
>> With that in mind, my point was meant to indicate we can also start afresh with a section of the code that you would like to focus on, using some of the same ideas (pulling out the relevant modules you want to work on).  This might be an attainable goal in the minds of GSoC reviewers and might suit your particular needs (for instance, if you had a research project reliant on such code).  I'm supportive either way, and I don't think you'll have a problem finding a mentor if you need one.
> 
> I suggested 1.70 only because it has no changes. And it won't be
> BioPerl 1.7. It would be Bio-Seq, Bio-Align, Bio-Popgen, etc v 1.70.

At some point we will likely find it hard to split out more w/o running into circular dependency issues.  This will likely center around Bio::Seq, Bio::SeqFeature, and Bio::Annotation (with others thrown in).  But let's see how far we can go with it.  If we get to a point where division becomes problematic, we can deem that 'core'.  But I would like to see Bio::Seq etc in their own space.
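
To make the circular-dependency concern concrete, here is a minimal sketch (in Python, purely for illustration) of how one might find groups of modules that cannot be separated because they depend on each other.  The dependency edges in DEPS are hypothetical placeholders, not the real BioPerl dependency graph, and strongly_connected_groups is an illustrative helper name, not part of any BioPerl tooling.  It uses Tarjan's strongly connected components algorithm; any group it reports would be a candidate for staying together as 'core'.

```python
def strongly_connected_groups(deps):
    """Return groups (size > 1) of modules that mutually depend on each other.

    deps maps a module name to the list of modules it uses.
    Implemented with Tarjan's strongly connected components algorithm.
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    groups = []
    counter = [0]

    def visit(node):
        index[node] = low[node] = counter[0]
        counter[0] += 1
        stack.append(node)
        on_stack.add(node)
        for dep in deps.get(node, ()):
            if dep not in index:
                visit(dep)
                low[node] = min(low[node], low[dep])
            elif dep in on_stack:
                low[node] = min(low[node], index[dep])
        if low[node] == index[node]:
            # node is the root of a component: pop its members off the stack
            group = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                group.append(member)
                if member == node:
                    break
            if len(group) > 1:
                groups.append(sorted(group))

    for node in sorted(deps):
        if node not in index:
            visit(node)
    return groups


# Hypothetical edges, for illustration only (NOT the actual BioPerl graph).
DEPS = {
    "Bio::Seq": ["Bio::SeqFeature", "Bio::Annotation", "Bio::Root"],
    "Bio::SeqFeature": ["Bio::Seq", "Bio::Annotation", "Bio::Root"],
    "Bio::Annotation": ["Bio::Seq", "Bio::Root"],
    "Bio::Root": [],
    "Bio::Biblio": ["Bio::Root"],
}

if __name__ == "__main__":
    for group in strongly_connected_groups(DEPS):
        print("cannot be split apart:", ", ".join(group))
```

With these made-up edges, Bio::Biblio and Bio::Root come out as cleanly separable, while the three mutually dependent modules are reported as one group.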

Re: versioning: I'm not hung up on any particular versioning scheme, but the key point is support.  It's easy for me to say "as of bioperl v2 the installation scheme will be something completely different" as opposed to doing so with v1.7.  Will installation of v1.7 be the same as it was for v1.6 (or even similar)?  Will it install the same modules by default?  We would be changing a key step in using BioPerl (installation) w/o much warning.

> These smaller distributions can then stay as they are or evolve into
> 2.0 if their maintainers are so interested. I saw biome and liked it,
> but is the plan to make a BioPerl 2.00 written in Moose?

Not necessarily, unless it can be demonstrated to help considerably.  I think it can FWIW.  

> Won't that
> path take us to the same place we are now in a couple of years? Won't
> it be better to make the split now, and make the clean break on each
> smaller distribution?

Right.  Exactly. (the latter point :)

> Would you be available to talk about this on #bioperl? I'm online
> there most of the time.
> 
> Carnë

I'll join in tomorrow, sure. I may be on and off channel due to meetings.  

chris


