[Bioperl-l] Splits again

Thu Jun 28 15:37:27 UTC 2007

On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> ...
>
> The short and sweet version: my proposal has all the benefits of  
> yours, but none of the disadvantages. What's not to like?

The short and sweet version: I'm more convinced after you laid out  
your argument in detail, which would have saved me some typing last  
night, BTW, thanks! ; >

The other core devs need to chip in and we need to openly (candidly)  
discuss it some more (I've added Hilmar to this).  There is also a  
tenable solution that allows both aspects ('cliques' and single mode)  
which might make everybody happy.

Let's say we only want to install Bio::SeqIO::genbank.  The  
Bio::SeqIO::genbank Build.PL would only install what was needed (as  
you indicated), only Bio::SeqIO::genbank-related tests would run  
(along with dependency tests, if available), and life would go on.   
However, what if we wanted to install everything in SeqIO/DB/AlignIO/ 
etc?

We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO  
modules installed or a select few (maybe a quick 'install all (y/n)?'  
followed by a list, which installs them one at a time along with  
dependencies), or have the option to specifically denote them as  
passed args to SeqIO's Build.PL, something like  
'perl Build.PL -install-plugins genbank embl swiss',  
'perl Build.PL -install-plugins all', etc.  If a specific module  
(Bio::SeqIO::genbank) is installed directly, then maybe the  
installation Q&A for the modules it pulls in could be bypassed by  
passing additional args down the dependency tree.
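
To make the passed-args idea concrete, here is a rough sketch of how  
the option might be parsed.  This is only an illustration under my own  
assumptions: the plugin list and the plugins_to_install() helper are  
hypothetical, and nothing like this exists in Build.PL or  
ModuleBuildBioperl today.  It uses plain Getopt::Long rather than  
anything Module::Build-specific.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical list of SeqIO format plugins; a real Build.PL would
# discover these from the distribution rather than hard-code them.
my @KNOWN_PLUGINS = qw(genbank embl swiss fasta);

# Turn Build.PL-style arguments into the module names to install.
sub plugins_to_install {
    my @args = @_;
    my @requested;

    # 's{1,}' lets the option take one or more values, so
    # '-install-plugins genbank embl swiss' fills @requested.
    GetOptionsFromArray(\@args, 'install-plugins=s{1,}' => \@requested)
        or die "Could not parse arguments\n";

    # 'all' expands to every known plugin.
    @requested = @KNOWN_PLUGINS if grep { $_ eq 'all' } @requested;

    # Refuse names we do not know about instead of guessing.
    my %known   = map { $_ => 1 } @KNOWN_PLUGINS;
    my @unknown = grep { !$known{$_} } @requested;
    die "Unknown plugin(s): @unknown\n" if @unknown;

    return map { "Bio::SeqIO::$_" } @requested;
}

# Example: the command line from the text above.
print "$_\n"
    for plugins_to_install(qw(-install-plugins genbank embl swiss));
```

With no -install-plugins argument the helper returns an empty list,  
which is where the interactive 'install all (y/n)?' prompt could take  
over.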

This would, in effect, be a bioperl-specific mini-CPAN within CPAN.   
Nice!

Now, this doesn't address several related issues, such as how we  
handle versioning of the independent modules (should be in a  
controlled manner), what we do about deprecated modules which linger  
about on CPAN, how we deal with PPMs/RPMs/packaging, and so on.  All  
have possible reasonable ways they can be addressed, I believe.   
Also, I think we should still think about doing regular full-scale  
'stable' (1.#) releases (sort of our stamp of approval for that batch  
of modules at that point in time, with a reasonable 'sell-by' date).

Again, it should be seriously discussed among the core devs and the  
bioperl community at large prior to any serious work on it, and it  
would be quite a large-scale project, but possibly worth it.  It can  
only go forward if there is enough momentum behind it.

>> Finally, all of this should wait until later.  Much later, like  
>> after a decent release, after svn, etc. kind of 'later'.  I think  
>> we can agree on that.
>
> Hmm, not really. If it can be implemented by a change in just  
> Build.PL and ModuleBuildBioperl, it's really independent of  
> everything else. That's the beauty of it: the only thing that  
> changes is how things are uploaded to and downloaded from CPAN. The  
> only person that normally deals with that issue is the pumpkin for  
> a release, and he only cares about it at release time.
>
> In fact, if we're going to do it at all it makes sense to try it  
> out on a minor release like 1.5.3. We've already got experience of  
> doing it split-style from 1.5.2. (And let me tell you: splits at  
> the code-base level suck.)

BOSC is coming up, and I would like to focus on getting svn migration  
taken care of ASAP (which is sounding more and more like we plan on  
moving all open-bio over, unless I misread Jason's post?) and  
stomping of bugs (my next priority after EUtilities).  Maybe in the  
interim we should try focusing on bug squashing, get out a quick  
standard dev release (1.5.3) before BOSC, and then a few of us could  
all communicate there via email/text/IM/phone off-list?  Maybe post  
updates via the bioperl blog and list?

> And where is the harm in letting them do it via CPAN as well? In  
> fact, there are significant benefits:
...

I'm already pretty convinced...

> The same can be achieved with CPAN bundles for each kind of  
> functional grouping you can think of. And since it's just a single  
> text file that defines such a grouping, it's easy to change or add  
> new ones as you feel like it, as opposed to the rather more  
> permanent and substantial effort of creating one of your splits on  
> the code-base level.

... or it could be run right in Module::Build for specific parent  
classes (as I mention above).  Bundling could be instituted for  
something like a standard GBrowse release (Bundle::BioPerl::GBrowse)  
where the functionality might be more spread out (Bio::DB*,  
Bio::Graphics, Bio::FeatureIO, etc).  For a full-scale old-style core  
install, another Bundle (Bundle::BioPerl::Standard).
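
For anyone who hasn't looked inside one, a CPAN bundle is just a  
module in the Bundle:: namespace whose POD has a CONTENTS section  
listing one module per line.  A minimal sketch of what the GBrowse  
bundle might look like follows; the bundle itself, its version number  
and the particular modules listed are only assumptions for  
illustration, not a worked-out list.

```perl
package Bundle::BioPerl::GBrowse;

use strict;
use warnings;

# Hypothetical version number for this illustrative bundle.
our $VERSION = '0.01';

1;

__END__

=head1 NAME

Bundle::BioPerl::GBrowse - illustrative bundle of modules for GBrowse

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerl::GBrowse'

=head1 CONTENTS

Bio::Graphics

Bio::DB::GFF

Bio::FeatureIO

=cut
```

Changing the grouping would then just mean editing the CONTENTS list  
and uploading a new version of the bundle.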

...

> Yes, it would be automated, and no, it wouldn't at all be any kind  
> of additional headache. I'm proposing a fully-automated system that  
> the pumpkin wouldn't even have to think about. Much /less/ of a  
> headache than dealing with splits. Orders of magnitude easier to  
> deal with.

The 'headache' would be the initial setup (splitting tests, individual  
Build.PL, etc), but this could be done stepwise or section-wise, I  
suppose.
...

> And the smallest, most concentrated set of modules is the  
> individual module.

Well, only if it runs correctly (i.e. has the entire dep. tree  
installed).  But the 'follow' tests would handle that.
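
As a rough sketch of what walking the dep. tree means here, the  
following works out an install order in which every module comes  
after the things it depends on.  The %DEPS map is a simplified,  
hypothetical example (not the real dependency list for  
Bio::SeqIO::genbank), and install_order() is my own helper name.

```perl
use strict;
use warnings;

# Hypothetical, simplified dependency map, for illustration only.
my %DEPS = (
    'Bio::SeqIO::genbank' => ['Bio::SeqIO', 'Bio::Seq::RichSeq'],
    'Bio::SeqIO'          => ['Bio::Root::Root'],
    'Bio::Seq::RichSeq'   => ['Bio::Seq'],
    'Bio::Seq'            => ['Bio::Root::Root'],
    'Bio::Root::Root'     => [],
);

# Depth-first walk: a module is added to the order only after all of
# its dependencies have been added.
sub install_order {
    my ($module, $deps, $seen, $order) = @_;
    $seen  ||= {};
    $order ||= [];

    # Marking on entry means shared (or circular) deps are visited once.
    return $order if $seen->{$module}++;

    install_order($_, $deps, $seen, $order)
        for @{ $deps->{$module} || [] };

    push @$order, $module;
    return $order;
}

print "$_\n" for @{ install_order('Bio::SeqIO::genbank', \%DEPS) };
```

The requested module always comes out last, so its tests would only  
run once everything underneath it is in place.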

> The reason some of these existing splits (microarray, ext) have  
> fallen by the way-side? /Because/ they're splits. If they had been  
> part of bioperl-live all along, they'd have been kept in a working,  
> compatible state and would have been released along with everything  
> else in 1.5.2.

microarray fell out of favor for other reasons (much faster ways to  
do the same thing via R), though I think it still could be salvaged  
if someone wanted to take it up.

The other bioperl distros (network, db, run, etc.) would also need  
to follow the same path as core, but I guess they could be bundled  
as well.

> ...
> No headaches.

I already have one, sorry!

chris