[Bioperl-l] Re: GSoC/BioPerl Reorganization Project

Siddhartha Basu sidd.basu at gmail.com
Fri Apr 29 02:15:01 UTC 2011


Hi Robert,
At what point in the flow will the dependencies between the split modules
be added? Is there a particular order in which the split modules will be
created? And how will the split-off modules be released to CPAN: one by
one as they are generated, or all in a batch, after which each follows
its own release schedule?

-siddhartha



On Thu, 28 Apr 2011, Robert Buels wrote:

> I think you guys are on the right track; here are some slightly more 
> detailed plans.  I'll use Chris's subject numbering.
>
> 1,2,3,5.) I envision the splitting algorithm going like this:
>
>      no strict; # this is pseudocode!
>
>      my $split_count = 0;
>      for $subsystem (qw( Bio::Root Bio::Das Bio::Event ... )) {
>
>          - take $subsystem modules and tests out of bioperl-live
>
>            (my $new_dist_name = $subsystem) =~ s/::/-/g;
>          - extract $subsystem modules into new dist called
>            $new_dist_name.  Make sure all its tests pass, and write
>            some more tests if necessary.
>
>          - add dep on $subsystem to bioperl-live/Build.PL
>
>            $split_count++;
>          - push $new_dist_name and bioperl-live to CPAN.
>            $new_dist_name has version '2.000', and bioperl-live has
>            version "1.7.$split_count".
>      }
>
>      and then, at the end of this loop, bioperl-live will be
>      nothing but a Build.PL and a couple of other things
>      for backcompat, like Bio::Root::Version, Bio::Perl, etc.
>
>      Important things to notice about this algorithm are that, at each
>      step in the loop:
>
>         a.) For users that install bioperl with CPAN,
>             doing cpan 'Bio::Perl' or cpan 'Bio::Root::Version' will
>             get you the same set of modules as before the split
>             started, with the split-off modules at 2.000 versions, and
>             the non-split-off ones at 1.7.x versions.
>
>         b.) For users (not developers) that are git cloning
>             bioperl-live, even though they are naughty (wink), they
>             can do 'perl Build.PL; ./Build installdeps' to get the
>             split-off modules, downloaded like any other CPAN
>             dependency.  There may be some lag before a newly split-off
>             distribution is downloadable from CPAN.
>
>         c.) For BioPerl developers, unless they are working on a
>             certain module, they should install the split-off modules
>             from CPAN like everybody else, and git clone only the piece
>             they are working on.
>
>         d.) The version of bioperl-live keeps increasing by one step
>             (1.7.1, 1.7.2, ...) with each split.  The systems that are
>             split off have a 2.x version number, each slightly different
>             depending on when it was split off.  After this point, their
>             release schedules and version numbers are independent of each
>             other and of bioperl-live.  For Bio::Perl and
>             Bio::Root::Version, the things that stay in bioperl-live,
>             installing the latest version will get you all the split-off
>             modules.
>
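
The naming and version numbering in the loop above can be sketched as
runnable Perl. This is a minimal sketch, not the actual release tooling:
the three-element subsystem list is a placeholder for the elided list,
split_plan is a hypothetical helper name, and incrementing the counter once
per split (so that the first split ships bioperl-live 1.7.1) is an
assumption about the intended numbering.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Placeholder list; the full list of subsystems is elided in the thread.
my @subsystems = qw( Bio::Root Bio::Das Bio::Event );

# Hypothetical helper: returns one record per split, holding the new
# distribution name, its version, and the bioperl-live version that
# would be released alongside it.
sub split_plan {
    my @names = @_;
    my @plan;
    my $split_count = 0;
    for my $subsystem (@names) {
        $split_count++;    # assumption: counted once per split
        ( my $new_dist_name = $subsystem ) =~ s/::/-/g;
        push @plan, {
            dist         => $new_dist_name,
            dist_version => '2.000',
            live_version => "1.7.$split_count",
        };
    }
    return @plan;
}

# One line per split: new dist, its version, matching bioperl-live version.
printf "%-10s %s  bioperl-live %s\n",
    @{$_}{qw(dist dist_version live_version)}
    for split_plan(@subsystems);
```

Keeping the plan as data makes it easy to check the numbering before
anything is pushed to CPAN.
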
>
> 6.) (thorny circular dependencies and stuff)  Those will quickly become 
> apparent as this process proceeds.  They'll take some finesse and/or 
> ruthlessness and/or hacking to get around.  We'll burn those bridges as we 
> come to them.
>
> 7.) (git submodules) Git submodules probably won't be necessary, since at 
> each step in the process BioPerl devs can use ./Build installdeps or cpanm 
> --installdeps .  to install whatever the dependencies are for the piece 
> they are working on, whether it's bioperl-live (in the case of a module 
> that has not yet been split off), or one of the distributions that has 
> already been split off (in which case their improvements will probably be 
> releasable to CPAN immediately!).
>
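
Both ./Build installdeps and cpanm --installdeps . resolve the prerequisites
that a distribution declares, so "add dep on $subsystem" comes down to one
more entry under requires. Below is a minimal sketch of what a Build.PL for
bioperl-live might look like after the first split, assuming Module::Build
is in use. The dist name, abstract, author string, and the choice of
Bio::Root::Root as the module carrying the 2.000 version are illustrative
assumptions, not the contents of the real file.

```perl
use strict;
use warnings;
use Module::Build;

# Illustrative values only; not the real bioperl-live Build.PL.
my $build = Module::Build->new(
    dist_name     => 'BioPerl',
    dist_version  => '1.7.1',
    dist_abstract => 'Installer for the split-off BioPerl distributions',
    dist_author   => 'BioPerl developers',
    license       => 'perl',
    requires      => {
        # One entry per split-off subsystem, keyed by a module it provides.
        'Bio::Root::Root' => '2.000',
    },
);

$build->create_build_script;
```

Each later split would add one more line to requires and bump dist_version.
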
> Lots of detail there.  I tried to make it structured and easy to skim 
> though.  Thoughts?
>
> Rob
>
>
>
> On 04/28/2011 02:04 PM, Chris Fields wrote:
> > Sounds fine; I think (as you indicate) we can deal with issues along the 
> > way.  Rob, anything to add?
> >
> > chris
> >
> > On Apr 28, 2011, at 2:53 PM, Sheena Scroggins wrote:
> >
> >> Chris,
> >>
> >> We haven't talked much about the versioning yet, but it will be on the 
> >> list to figure out asap.
> >>
> >> So far, the plan is to split out Bio::Root first, followed by a couple 
> >> modules that depend only on Bio::Root. The plan I proposed was Bio::Das, 
> >> Bio::Event then Bio::Location. Depending on how much time is remaining 
> >> for the GSoC project, the next to split out would be Bio::Factory and 
> >> Bio::Coordinate, because they depend on Bio::Root and Bio::Location. I 
> >> plan to still help with the reorganization after the internship is over, 
> >> but I obviously have to have a stopping point for the GSoC project.
> >>
> >> Rob provided me with a really nice script to list the dependencies of the 
> >> modules, so I plan to make a roadmap toward the end of the summer that 
> >> will help guide the rest of the reorganization. At that point, we'll have 
> >> to deal with the circular dependencies carefully.
> >>
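
A split order that respects the dependencies described above can be derived
mechanically. This is a minimal sketch using a depth-first topological sort.
The dependency map holds only the relationships mentioned in this thread
(a real map would come from the dependency-listing script), and split_order
is a hypothetical helper name.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Dependencies as described in the thread; illustrative, not exhaustive.
my %deps = (
    'Bio::Root'       => [],
    'Bio::Das'        => ['Bio::Root'],
    'Bio::Event'      => ['Bio::Root'],
    'Bio::Location'   => ['Bio::Root'],
    'Bio::Factory'    => [ 'Bio::Root', 'Bio::Location' ],
    'Bio::Coordinate' => [ 'Bio::Root', 'Bio::Location' ],
);

# Depth-first topological sort: a subsystem is emitted only after all of
# its dependencies. Dies if it runs into a circular dependency.
sub split_order {
    my ($deps) = @_;
    my ( %state, @order );
    my $visit;
    $visit = sub {
        my ($name) = @_;
        my $seen = $state{$name} // '';
        return if $seen eq 'done';
        die "circular dependency involving $name\n" if $seen eq 'visiting';
        $state{$name} = 'visiting';
        $visit->($_) for sort @{ $deps->{$name} // [] };
        $state{$name} = 'done';
        push @order, $name;
    };
    $visit->($_) for sort keys %$deps;
    return @order;
}

print join( "\n", split_order( \%deps ) ), "\n";
```

Any order it prints is only one of several valid ones. Subsystems with the
same dependencies (Bio::Das and Bio::Event, say) can be split in either
order. The die on a cycle is where the circular dependencies would surface.
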
> >> This is a huge project, much bigger than I can do in one summer. But I 
> >> plan to get it started in a way that makes it easy for others to 
> >> contribute.
> >>
> >> Sheena
> >>
> >>
> >> On Wed, Apr 27, 2011 at 12:35 PM, Chris Fields<cjfields at illinois.edu>  
> >> wrote:
> >> Sheena,
> >>
> >> Congrats on being accepted! We've talked about doing this over the years, 
> >> but it's not an easy task and it needs a dedicated project to get the 
> >> ball rolling, so to speak.  Hopefully this isn't tl;dr.  I'll start off 
> >> with a few of my questions/thoughts (Rob could probably chime in as well, 
> >> but I think his general thoughts on the project parallel mine):
> >>
> >> 1) The current BioPerl CPAN distribution could just be a simple install script, acting 
> >> like a 'Task' or 'Bundle' module, installing the actual Bio-specific 
> >> distributions.  Doing it this way would allow you to iteratively split 
> >> off additional code but retain the original Task/Bundle-based approach to 
> >> installation.  For instance, the first pass could split out Root, then 
> >> have a dependency-light and 'extras' distribution, 2nd round split 
> >> further based on function, and so on:
> >>
> >>   1st round (v 1.9)   :  BioPerl (just an installer) ->  installs root, min-deps, extra-deps
> >>   2nd round (v 1.901) :  BioPerl (just an installer) ->  root, seq/feature, other-min-deps, extra-deps
> >>   ...
> >>   Xth round (v 1.99)  :  BioPerl (just an installer) ->  root, tools, seq, tree, align, coord, map, everything-else
> >>   ...
> >>
> >> Also, one could potentially install modules in various ways: 
> >> interactively, in predetermined groups, using a user-defined list, etc 
> >> (one could effectively create custom BioPerl installs for GBrowse or 
> >> other tools for instance).  Of course I would only pick the easiest route 
> >> to start, but maybe that gives some ideas.  Regardless, if the dependency 
> >> tree is set up correctly any reliance on other Bio* modules would be 
> >> defined in the various Build.PL/Makefile.PL and then installed via CPAN 
> >> (as is any dependency).
> >>
> >> 2) The Bio::Root modules are probably the true core modules and are the 
> >> most stable with regard to changes, so those could be moved to something 
> >> like BioPerl-Core.  Beyond that, what are the proposed splits?  (we've 
> >> discussed this on-list before, but it's appropriate to bring this up 
> >> again)
> >>
> >> 3) How do we want to handle versioning?  We can't (and probably 
> >> shouldn't) release everything on a synchronized versioning scheme (via 
> >> Bio::Root::Version, for instance); that'll quickly fall apart.  
> >> Personally I can foresee each split-off dist having its own version, 
> >> with the BioPerl network of modules being in effect its own mini-CPAN.
> >>
> >> 5) Related to versioning, in my opinion we should maybe aim to eventually 
> >> call this BioPerl v2.0 and start with a simpler X.Y versioning 
> >> scheme.  Lincoln has already done something like this with Bio::Graphics, 
> >> which was originally part of BioPerl but split off prior to v 1.6.0.
> >>
> >> 6) In some cases I can see particularly thorny problems, such as circular 
> >> dependencies.  I can think of a few ways to address that (creating a 
> >> simple lightweight Bio::Species class as a fallback if Bio::Tree code 
> >> isn't present, for instance), but any additional thoughts on this would 
> >> be helpful.
> >>
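
The fallback idea in point 6 rests on checking at run time whether an
optional module is installed. This is a minimal sketch of that check only.
module_available is a hypothetical helper, and the two My::Species::* class
names are placeholders for illustration, not existing BioPerl classes.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded and 0
# otherwise, without dying when the module is not installed.
sub module_available {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Placeholder class names: a tree-backed implementation when the Bio::Tree
# code is present, a lightweight fallback when it is not.
my $species_class = module_available('Bio::Tree::Node')
    ? 'My::Species::TreeBacked'
    : 'My::Species::Simple';

print "would use $species_class\n";
```

The optional module never appears as a hard prerequisite this way, which
would break the circle at the cost of maintaining two implementations.
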
> >> 7) Do we want to set up something like 'git submodule' for the devs to 
> >> pull down all BioPerl-relevant code?
> >>
> >> Other thoughts?
> >>
> >> chris
> >>
> >> On Apr 27, 2011, at 12:17 AM, Sheena Scroggins wrote:
> >>
> >>> Hey everyone,
> >>>
> >>> I wanted to take a minute to introduce myself as one of the Google
> >>> Summer of Code interns. I was the lucky one chosen to work on the
> >>> BioPerl Reorganization (*crowd cheers*). I am a grad student in
> >>> bioinformatics, and somewhat new to this level of programming, so bear
> >>> with me as I learn the technical jargon. Luckily I have both Rob and
> >>> Chris to mentor me this summer!
> >>>
> >>> Reading through the mailing list archives, I see there have been many
> >>> discussions and differing opinions about tackling this project. Given
> >>> the time frame for GSoC and my limited experience, there is no way I
> >>> will complete this project on my own, but I will at least be able to
> >>> start it, which will hopefully motivate others to pitch in. So far, the
> >>> plan for the GSoC project is to start by breaking out Bio::Root,
> >>> followed by a couple of other modules based on their dependencies and
> >>> the time allowed. Each will be published to CPAN independently. You can
> >>> follow the project (once it starts) on github at
> >>> https://github.com/sheenams.
> >>>
> >>> I look forward to collaborating with many of you on the reorganization
> >>> (hint hint)!
> >>>
> >>> Sheena
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >>
> >
> >
>


