[Bioperl-l] Priorities for a bioperl-1.6 release

aaron.j.mackey at gsk.com
Tue Mar 18 12:23:41 EDT 2008


Very cool.  I can envision this being printed as a laminated poster to put 
up next to the periodic table of Perl Elements (
http://www.ozonehouse.com/mark/blog/code/PeriodicTable.html)

One GraphViz trick you could try would be to group Bio::X::* (nodes and 
your collection groups sharing common Bio::X:: prefixes) together as 
subgraphs; that should quickly show you which edges go outside of the 
various "domains", and which are entirely self-contained.
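
A rough sketch of that grouping, in Python purely for illustration (it is
not part of module_usage.pl). It assumes the usage data is already
available as (user, used) package pairs; the function names and the two
example edges are made up for the sketch and are not taken from
module_usage.txt. GraphViz draws a subgraph as a box only when its name
starts with "cluster".

```python
from collections import defaultdict

def domain(pkg, depth=2):
    """Leading namespace, e.g. 'Bio::Graphics' for 'Bio::Graphics::Glyph::generic'."""
    return "::".join(pkg.split("::")[:depth])

def crossing_edges(edges):
    """Edges whose two ends live in different Bio::X domains."""
    return [(s, d) for s, d in edges if domain(s) != domain(d)]

def to_dot(edges):
    """Build DOT text with one cluster subgraph per Bio::X domain."""
    clusters = defaultdict(set)
    for src, dst in edges:
        clusters[domain(src)].add(src)
        clusters[domain(dst)].add(dst)
    lines = ["digraph usage {"]
    for i, (name, pkgs) in enumerate(sorted(clusters.items())):
        lines.append(f"  subgraph cluster_{i} {{")
        lines.append(f'    label="{name}";')
        for pkg in sorted(pkgs):
            lines.append(f'    "{pkg}";')
        lines.append("  }")
    for src, dst in edges:
        # Edges that leave their own domain are highlighted in red.
        attr = "" if domain(src) == domain(dst) else " [color=red]"
        lines.append(f'  "{src}" -> "{dst}"{attr};')
    lines.append("}")
    return "\n".join(lines)

# Illustrative edges only (hypothetical input, not real analysis output).
EXAMPLE_EDGES = [
    ("Bio::Graphics::Glyph::box", "Bio::Graphics::Glyph::generic"),
    ("Bio::AlignIO::clustalw", "Bio::SimpleAlign"),
]

if __name__ == "__main__":
    print(to_dot(EXAMPLE_EDGES))
```

Piping the printed text through "dot -Tpng" should give one box per
domain, with the red edges being the ones that cross between boxes.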

You could also try to distinguish "use base" relationships (i.e. 
inheritance) vs. "use Bio::X" (delegation, composition, etc.) vs. "require 
Bio::X" wrapped in an eval (optional use if available) by various edge 
colorings -- this might help to further break things up if we can guess at 
the intended "use" of any Bio::X by Bio::Y.
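
A minimal sketch of how the three kinds might be told apart, again in
Python for illustration only. The regexes are a line-by-line heuristic of
my own, not how module_usage.pl works: a "require" counts as optional
only if "eval" appears on the same line, so multi-line eval blocks are
missed, and a plain "require" is lumped in with "use". The colour choices
and the example source are assumptions too.

```python
import re

# Assumed colour scheme: one colour per kind of relationship.
EDGE_COLORS = {"inherit": "blue", "use": "black", "optional": "gray"}

BASE_RE = re.compile(r"^\s*use\s+base\s+(.+?);")
USE_RE = re.compile(r"^\s*use\s+(Bio::[\w:]+)")
REQUIRE_RE = re.compile(r"\brequire\s+(Bio::[\w:]+)")
PKG_RE = re.compile(r"Bio::[\w:]+")

def classify(perl_source):
    """Return (kind, module) pairs for Bio::* dependencies in perl_source."""
    found = []
    for line in perl_source.splitlines():
        m = BASE_RE.match(line)
        if m:
            # 'use base' may list several parents, e.g. qw(A B).
            found += [("inherit", p) for p in PKG_RE.findall(m.group(1))]
            continue
        m = USE_RE.match(line)
        if m:
            found.append(("use", m.group(1)))
            continue
        m = REQUIRE_RE.search(line)
        if m:
            # Heuristic: same-line eval means "optional if available".
            kind = "optional" if "eval" in line else "use"
            found.append((kind, m.group(1)))
    return found

def dot_edge(src, kind, dst):
    """One DOT edge line, coloured by the kind of relationship."""
    return f'"{src}" -> "{dst}" [color={EDGE_COLORS[kind]}];'

# Illustrative Perl snippet (hypothetical, not copied from any module).
EXAMPLE = """
use base qw(Bio::Root::Root Bio::SeqI);
use Bio::SimpleAlign;
eval { require Bio::Graphics::Glyph::generic; };
"""

if __name__ == "__main__":
    for kind, module in classify(EXAMPLE):
        print(dot_edge("Bio::Y", kind, module))
```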

-Aaron

"Sendu Bala" <bix at sendu.me.uk> wrote on 03/18/2008 11:32:25 AM:

> aaron.j.mackey at gsk.com wrote:
> >> Or is the split intended to be 'core' == "anything and everything
> >> that was in 1.4", '????' == "everything else"? In which case,
> >> what's a good name for "modules created after 1.4"? 'crust'? ;)
> > 
> > Nah, "icing".
> > 
> > a module "use" map might be very useful to help identify "core" vs.
> > other layers of mantle/crust/icing.
> > 
> > http://www.perlmonks.org/?node_id=87329 
> > http://search.cpan.org/src/NEILB/pmusage-1.2/
> 
> Thanks for those. Neither could quite cope with BioPerl, but I've munged
> them together and hacked up 'module_usage.pl' which I've just committed
> to the maintenance directory of bioperl-live.
> 
> module_usage.pl ../Bio
> 
> Produces:
>   *warning, may crash your browser; download it and view in a dedicated
> image viewer*
> http://bix.sendu.me.uk/files/module_usage.jpeg
> http://bix.sendu.me.uk/files/module_usage.txt
> 
> First I considered what modules each BioPerl package (aka class, module)
> 'uses' (what modules does it load via 'use', 'require' or inherit from
> via 'use base', excluding external (non-BioPerl) modules), then grouped
> together packages that have identical usage. The graph shows all the
> groups with more than one member as nodes and edges from them pointing
> to the individual packages that they use. The set of those individual
> packages pointed to by groups also have edges showing their
> use-relationship to other members of the set (only). Members of the set
> are also shaded in red. The saturation of the shade indicates how many
> packages use that package (so dark red packages are used a lot).
> 
> (I had to simplify in this way because otherwise GraphViz bailed on me.
> If anyone can come up with nicer simplification/visualisation systems,
> please do! It's important to note that there is lots of information loss
> in my scheme, so you can't rely on the graph alone.)
> 
> Getting to the question on how to decide what is 'core' and on what
> basis to split things up, first consider the darker red packages. Next
> consider how many groups point to it. Finally consider the membership of
> those groups: are they all highly related, or are they from different
> 'parts' of BioPerl?
> 
> For example, Bio::Graphics::Glyph::generic is dark red and has 3 groups
> pointing to it, but all the members of those groups are
> Bio::Graphics::Glyph*. You could imagine that Bio::Graphics::Glyph (or
> Bio::Graphics?) could be split off cleanly if desired and not kept in
> core. Bio::SimpleAlign, on the other hand, whilst not being quite as
> dark a red, has 7 attached groups with members from Bio::AlignIO,
> Bio::Search and Bio::Tools. You could easily argue it is more
> fundamental to BioPerl and should be in core. In turn, the things that
> Bio::SimpleAlign points to would also have to be in core.
> 
> I haven't done any full analysis along these lines and leave it as an
> exercise for the interested reader for now ;)
> 
> 
> Chris Fields wrote:
> > http://www.bioperl.org/wiki/Talk:Proposed_1.6_core_modules
> > 
> > I'm pretty flexible on any of that; it's a proposal only and I think
> > some of it may be wrongheaded, but hey, I'm willing to take a few
> > rotten tomatoes.  The key issue is we should try to work out what we
> > mean by 'core' or the core library.  I have a rather extreme view of
> > it as being the bare essentials without external, non-perl core
> > dependencies (only SeqI/PrimarySeqI, AlignI, AnnotationI, SeqFeatureI
> > and required modules for those classes) but I'm sure others would
> > lump in parsers, DB functionality, etc.  I basically suggest placing
> > those (and any stable but potentially non-core code) in a
> > 'bioperl-main', with any unstable or untested code going into a
> > 'bioperl-unstable'.
> 
> My thoughts are along these lines:
> # I agree that core should have no external dependencies
> # I agree that it might mostly be interfaces
> # It should represent a framework with all the interfaces (that have
>    stable APIs), directory structure and base classes that everything
>    else relies on
> # It might not do much useful bioinformatics, but provides just about
>    everything needed for a dev to create a new module that does
> 
> 
> > In essence, bioperl-main would require core and resemble a stable
> > release; bioperl-unstable would require bioperl-main (and core) and
> > resemble a dev release.  Not sure how versioning would go or if this
> > is a viable option at all, but it's worth discussing.
> 
> # I agree that this 3-way split seems reasonable
> # bioperl-main would consist primarily of the 'leaves' of the module
>    tree, mostly parsers and the like which, whilst 'stable' and tested
>    should still be split away from core because the data sources they
>    parse could change format slightly
> # bioperl-unstable, better bioperl-bleed, would feature brand-new
>    stuff, be it new parsers for totally new formats, new APIs that do
>    something not thought of before etc. When they are complete, bug-free
>    and have stood the test of time they get moved into bioperl-main.
>    (It is not a place for all new commits; bug fixes to something in
>    bioperl-main would be committed to bioperl-main)
> # The current splits (bioperl-run, bioperl-network etc.) do not get
>    their own core and bleed variant. Anything they need for core
>    functionality would enter the single bioperl-core, anything new
>    would enter the single bioperl-bleed, and anything stable would
>    be in their own bioperl-[package]
> 
> Discuss :)
> 



