[Bioperl-l] Re: Bio::FeatureHolderI interface confusion

Lincoln Stein lstein at cshl.edu
Wed Jun 18 19:17:17 EDT 2003


Hi,

Open Source projects are a bit weedy.  You have to prune them back regularly 
to keep them from spreading out of control.  The original Bio::Seq* 
classes grew out of one use case -- reading and writing flat files -- and 
have become rather mangy as they've been repurposed to meet new needs.

I find Ewan's separation of bioperl clients into "users" and "developers" a 
little artificial.  We'd like to see our customers begin as users and 
transition into developers as they gain experience.  Well-designed and 
documented code serves everyone equally well.

My experience two years ago in coming back to bioperl after having been away for 
the better part of a decade was not that the interfaces were confusing or too 
numerous, but that they were the *wrong* interfaces for my applications.  For 
example, I wanted to be able to work with sets of sequence features without 
necessarily having an instantiated sequence string around.  I wanted to be 
able to perform coordinate arithmetic with features so as to locate one 
feature relative to another.  I wanted to be able to produce a graphical 
rendering of a feature completely generically.  None of this was possible at 
the time.  So I did the obvious thing, and created subclasses which 
implemented the methods I needed, and as an afterthought created interfaces 
to describe what I had done.   Lo!  A new weed was born.

The problem is that everyone has had the same experience, has extended the 
library, and has added ad hoc and inconsistent interfaces.  Now our garden is 
overrun.

The solution is to go back to the use cases, figure out what types of problems 
we want the modules to address, and then design a small number of interfaces 
that do the job.  We should either use Damian Conway's Class::Contract to 
enforce our use of the interfaces at compile time, or use Paul's proposed 
@ISA tree walker to regression test that each required method is implemented 
(although this is harder to do than it looks, and I'd like to see how this 
works).
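
A minimal sketch of what such an @ISA walker might look like follows.  It is 
not Paul's actual code.  The package names and the required_methods() 
convention are hypothetical, invented purely for illustration, and nothing 
here is an existing bioperl API.

```perl
use strict;
use warnings;

# Hypothetical interface: it only declares which methods are required.
package My::FeatureHolderI;
sub required_methods { return qw(get_SeqFeatures feature_count) }

# Hypothetical implementer that forgets to provide feature_count().
package My::SimpleHolder;
our @ISA = ('My::FeatureHolderI');
sub new             { my ($class) = @_; return bless { features => [] }, $class }
sub get_SeqFeatures { my ($self) = @_; return @{ $self->{features} } }

package main;

# Return the class followed by all of its ancestors, walking @ISA
# depth-first and visiting each package only once.
sub ancestors {
    my ($class, $seen) = @_;
    $seen ||= {};
    return () if $seen->{$class}++;
    no strict 'refs';
    my @parents = @{"${class}::ISA"};
    return ($class, map { ancestors($_, $seen) } @parents);
}

# For every ancestor that itself declares required_methods(), list the
# required methods that the class under test cannot perform.
sub missing_methods {
    my ($class) = @_;
    my @missing;
    for my $ancestor (ancestors($class)) {
        no strict 'refs';
        # Only look at packages that define the declaration directly,
        # not ones that merely inherit it.
        next unless defined &{"${ancestor}::required_methods"};
        for my $method (&{"${ancestor}::required_methods"}()) {
            push @missing, "${ancestor}::$method"
                unless $class->can($method);
        }
    }
    return @missing;
}

print "$_ is not implemented\n" for missing_methods('My::SimpleHolder');
```

Run as a regression test, this should flag feature_count as missing from 
My::SimpleHolder.  Note that can() only tells you a method exists somewhere 
in the hierarchy.  An interface that ships stub methods would satisfy can() 
trivially and would need a different check, which is part of why this is 
harder than it looks.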

We started an informal redesign process a couple of months ago in a series of 
e-mail exchanges with Paul, Ewan, Hilmar and Aaron, and I think we made some 
good progress towards the outlines of a "Bioperl 2."  It hasn't gone very far 
since then, and I guess the question is how public a process this type of 
redesign should be, and how to manage the various competing needs.

Lincoln

On Wednesday 18 June 2003 03:26 pm, Paul Edlefsen wrote:
> I've been keeping silent on this (didja notice?), but as Ewan predicted, I
> have views here.
>
> The idea of protocols -- per-method contracts -- intrigues me; Perl offers
> the can() facility, which could be used here.  I personally have not
> experienced any necessity for this, but I'm willing to believe that y'all
> have.
>
> Perl, as we are painfully aware, does not enforce contracts.  In my
> experience as a Java developer and a bioperl developer I have come to
> appreciate the necessity of enforcing contracts; my interpretation of the
> complexity of bioperl -- the near impossibility of treating it as a
> component model -- is that a failure to enforce interface contracts is the
> principal stumbling point.
>
> It has been mentioned in this thread that complex interfaces are devised
> and then ignored.  I suspect that the concept of enforcing these interfaces
> is terrifying at first blush: does that mean that I have to actually
> implement all of this crap?  There are a couple of points to consider before
> dismissing enforcement, though: 1) if we used interfaces correctly then
> they would not be impossible to implement; 2) interface contracts may allow
> null-responses (e.g. if a SeqFeatureI isa FeatureHolderI it does not
> *necessarily* contain subfeatures, but you can ask it how many subfeatures
> it has (it has 0); although FeatureHolderI presently asserts the
> often-unsupportable contract that FeatureHolderI implementers always
> accept the addition and removal of subfeatures, this is not a failure in
> Object Oriented Programming, it is a failure in our FeatureHolderI contract
> design).
>
> I have personally consolidated the FeatureHolderI variants, so I'm pretty
> familiar with this particular area of the bioperl library.  I found that
> this contract is duplicated (e.g. GFFI, DasI, FeatureHolderI,
> SeqFeature::CollectionI), unused (e.g. CollectionI), and ignored (e.g.
> gbrowse accesses all feature providers as if they are GFF.pm).  On what we
> affectionately refer to as the 'freaky dev branch', branch-1-2-collection,
> I have unified these things into one interface, called
> Bio::SeqFeature::CollectionI (which inherits, for backwards compatibility,
> from FeatureHolderI).  SeqFeatureI is capable of holding subfeatures, so on
> this branch Bio::SeqFeatureI isa Bio::SeqFeature::CollectionI.  I have also
> made a version of gbrowse that uses this interface, as well as a data
> provision interface called Bio::DB::FeatureProviderI, that fetches feature
> collections from a backing store.
>
> I agree that the interfaces presented to novices should be few and simple
> and straightforward.  I do not think, though, that the interfaces presented
> to programmers need be otherwise.  I am not the best designer of
> interfaces, and those that I have designed might not be the best solution,
> but if we as a community can commit to the concept that an interface is an
> inviolable contract and that use of interfaces is prerequisite to
> component-oriented development, then the failures in an interface will lead
> not to its violation, neglect, or duplication, but to its correction.
>
> I can see that the culture of bioinformatics software development is
> presently disinclined towards self-enforcement of software design
> contracts.  We will not abandon bioperl, though; the only direction forward
> (IMO) is through some sort of refactoring.  If we cannot rely on
> contributors to enforce interface contracts, could we perhaps enforce them
> through some software solution (in Java or C++ this is the compiler's job)?
> Like runnable synopses, could we not test *on checkin*, or at least in the
> test suite, that interface contracts are enforced?
>
> If anyone is interested, to this end I have created a very small number of
> initial interface tests, in the ti directory, on the freaky dev branch. 
> The ultimate idea is that the ISA hierarchy will be climbed and anything
> claiming to support an interface will have to pass the test corresponding
> to that interface.  This is, after all, what ISA means: it is safe to think
> of me as a BLAH.  Why not test this?
>
> Okay, thanks for reading my rant.
>
>   :Paul

-- 
========================================================================
Lincoln D. Stein                           Cold Spring Harbor Laboratory
lstein at cshl.org                                Cold Spring Harbor, NY
========================================================================



