[Bioperl-l] Bio::FeatureHolderI interface confusion

Hilmar Lapp hlapp at gmx.net
Wed Jun 18 17:02:43 EDT 2003


On Tuesday, June 17, 2003, at 04:18  PM, Chris Mungall wrote:

>
> I don't understand what you are saying. Surely it is impossible to
> ignore the interfaces since they are key to understanding how the
> bioperl object model works?
>

Not so long ago Aaron started a drive to achieve a better separation 
between interfaces for the geeks and a more streamlined (more easily 
understandable) object model for actual users. One of his ideas was to 
move all interfaces into one directory; the user would then supposedly 
be able to understand and work with the object model without looking 
into that directory ...

This obviously wouldn't make interfaces go away for developers who 
contribute to core areas. I'd maintain, though, that for that group of 
people a good understanding of the key contracts (as expressed in 
interfaces) is not really avoidable.

> Within the "rant" is a valid question:
>
> can I take the features coming out of SeqIO produced Seqs and call
> methods such as get_SeqFeature and add_SeqFeature on them? Presumably
> no, as this completely violates the contract. If I want to start
> adding subfeatures, what am I supposed to do? You answered that
> question below - I test using an isa() call. Ok, that makes some
> sense.
>

Right.

Note though that with 'add_SeqFeature' you just hit another issue on 
which (due to lack of consensus or whatever) we are neither consistent 
nor clear in the documentation, namely what the methods to modify 
properties are. The interface methods only demand getters, not 
mutability. We've had, quite understandably, users on the list who got 
confused about that.

One resolution that I've started to follow is to at least document on 
the interface how you can expect to be able to modify the property if 
the object is mutable. As a matter of fact, though, I can't name 
off-hand a bioperl object that is immutable, and I suspect we aren't 
going to have a lot of them. So it's an over-complication if you boil 
it down.

I'd be happy to support any effort to simplify this, e.g. by simply 
adding the setters to the contracts.
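
To make the getter-versus-setter point concrete, here is a minimal 
sketch, not actual bioperl code. The My::* packages are hypothetical 
stand-ins, and primary_tag is used only as an example property. The 
interface demands the getter; the implementation additionally follows 
the combined get/set accessor style that the interface merely documents.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an interface: it only demands a getter.
package My::FeatureI;

# Contract: primary_tag() returns the tag of the feature.
# Documented expectation (not demanded): a mutable implementation
# accepts a new value as its first argument.
sub primary_tag { die "primary_tag() not implemented\n" }

# Hypothetical mutable implementation of that contract.
package My::Feature;
our @ISA = ('My::FeatureI');

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}

sub primary_tag {
    my $self = shift;
    $self->{primary_tag} = shift if @_;   # setter, only when a value is given
    return $self->{primary_tag};          # getter, always
}

package main;

my $feat = My::Feature->new(primary_tag => 'exon');
print $feat->primary_tag, "\n";   # getter, as demanded by the contract
$feat->primary_tag('CDS');        # setter, by documented convention only
print $feat->primary_tag, "\n";
```

Adding the setter to the contract would amount to moving the "documented 
expectation" comment into the list of things the interface demands.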


> There is also the other question I posed:
>
> If I have a module that builds nested Bio::SeqFeature::Generic objects,
> and I want users of that module to use these features as nested
> objects, how do I describe these features in my interface/contract?
>
> If I say they conform to Bio::SeqFeatureI then no one can get the
> nested features without violating the contract. If I say they conform
> to Bio::FeatureHolderI, then no one can use the feature specific
> methods. Can I say they conform to both?

Yes.

> This seems inconsistent with most formal interface specification
> descriptions, such as java, where I would have to create the
> cross-product, Bio::SeqFeatureThatIsAHolderI.

No, you wouldn't. This is Java's form of multiple inheritance: a Java 
class can implement any number of interfaces (but extend only a single 
class), and an interface can extend any number of other interfaces.
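
In Perl the equivalent is simply to list both interfaces in @ISA. The 
following is a rough sketch under stated assumptions: the My::* packages 
are hypothetical stand-ins for Bio::SeqFeatureI and Bio::FeatureHolderI 
so that it runs without bioperl installed, and the methods shown are 
only a tiny subset of what the real contracts ask for.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the two interfaces.
package My::SeqFeatureI;
sub start { die "start() not implemented\n" }

package My::FeatureHolderI;
sub get_SeqFeatures { die "get_SeqFeatures() not implemented\n" }

# A feature class that declares conformance to both contracts,
# without any combined "cross-product" interface.
package My::NestedFeature;
our @ISA = ('My::SeqFeatureI', 'My::FeatureHolderI');

sub new {
    my ($class, %args) = @_;
    my $self = { start => $args{start}, subfeatures => [] };
    return bless $self, $class;
}

sub start           { return $_[0]->{start} }
sub get_SeqFeatures { return @{ $_[0]->{subfeatures} } }

sub add_SeqFeature {
    my ($self, @feats) = @_;
    push @{ $self->{subfeatures} }, @feats;
    return;
}

package main;

my $gene = My::NestedFeature->new(start => 100);
$gene->add_SeqFeature(My::NestedFeature->new(start => 120));

# Both isa() tests succeed for the same object.
print "is a feature\n" if $gene->isa('My::SeqFeatureI');
print "is a holder\n"  if $gene->isa('My::FeatureHolderI');
print "first subfeature starts at ", ($gene->get_SeqFeatures)[0]->start, "\n";
```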

>  Should I do this?
>
> In fact, here's a simpler example. I want to create a Bio::SeqI object
> that contains nested features. How do I publish that as a contract? A
> SeqI is a FeatureHolderI for SeqFeatureI objects. Are the users of my
> module meant to guess that the features are nested? How do I specify
> that *formally* in my module?

You can't. It's perl :-)

What I mean is that this is a contract that mostly binds by 
documentation only, and as you correctly pointed out, one should not 
make the mistake of expecting it to be enforced by the compiler.

Aaron mentioned a few modules on CPAN that attempt to enforce 
contracts; I've forgotten which. Aaron or I can dig those up if you're 
interested. AFAIR the enforcement is limited though, so there wasn't 
much enthusiasm for moving further in that direction.

Do you want a formal expression of contracts, with adherence to them 
then being enforced?

>
>> Your point about when and where are features returned that can nest is
>> certainly true. Note also though that what you guys are trying to do
>> is the first use case pushing this envelope. None of the SeqIO parsers
>> will produce nested features - so people haven't necessarily missed
>> out on that yet. In fact we had people on the list asking why the list
>> of subfeatures is empty. I anticipate you guys to find more things
>> that aren't very sensible here.
>
> chadoxml parsers would produce nested features, so this is incorrect.
>
> Also, I'm not sure where the GFF3 parser would sit in the whole *IO
> framework (I don't see why this shouldn't go in SeqIO as a GFF3 file
> will produce the same kind of data as a genbank parser). GFF3 certainly
> uses nested features.

I was saying that now that chadoxml (and possibly GFF3 IO) is there, 
issues are surfacing that weren't necessarily that obvious before.


>
>> My personal view is biased towards more interfaces as will be no
>> surprise. Interfaces in perl to me have nothing to do with compile
>> time whatever, they mostly provide for contracts. Classes change and
>> expand all too quickly, and I've experienced that people quickly
>> forget what was supposed to be the minimal set of methods you can
>> expect to be present. I also like contracts to demand only what they
>> need to demand, and flat features make perfect sense to me. I believe
>> interfaces have been quite useful e.g. in the SearchIO tree.
>
> It's quite possible you're correct. However, I personally find myself
> getting tied in horrendous mental knots trying to figure out which
> contracts are meant to be enforced when.
>
> I'm also fairly sure that contract violation is rife.

Well, if there are people who run a red light, that doesn't necessarily 
indicate that we should abandon all traffic lights to make navigating 
intersections more straightforward.

>
>> And yes, when I want to be sure in my code that a feature has the
>> ability to return nested features I ask whether it is a FeatureHolderI
>> and throw an exception otherwise. Doing so costs me exactly one line
>> of code. Not too bad in my personal view.
>
> Ok, I see. So there is a test, and an implicit cast going on.
>
> It still leaves me uneasy. There's some weird kind of guesswork going
> on. If someone switches the underlying implementation from
> Bio::SeqFeature::Generic to Bio::SeqFeature::Foo in the SeqIO builders,
> and Foo does not implement FeatureHolderI, then it has side effects in
> other parts of my code. This is completely contrary to modular software
> design.

Well, I guess the question is whether or not you permit SeqFeatureI 
implementations that cannot have nested features. If you generally do, 
but your code does not permit them, then the side effect is that you 
throw an exception.
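
For illustration, the test-and-throw pattern could look roughly like 
the sketch below. The My::* packages and the subfeatures_of() helper are 
hypothetical stand-ins so that it runs without bioperl; real code would 
test against Bio::FeatureHolderI and would typically raise the exception 
through the root object's throw() method rather than a bare die.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical stand-in for Bio::FeatureHolderI.
package My::FeatureHolderI;
sub get_SeqFeatures { die "get_SeqFeatures() not implemented\n" }

# A flat feature: does not claim to hold subfeatures.
package My::FlatFeature;
sub new { return bless {}, shift }

# A nested feature: does claim to hold subfeatures.
package My::NestedFeature;
our @ISA = ('My::FeatureHolderI');
sub new { return bless { subfeatures => [] }, shift }
sub get_SeqFeatures { return @{ $_[0]->{subfeatures} } }

package main;

# Return the subfeatures, or die if the contract is not advertised.
sub subfeatures_of {
    my ($feat) = @_;
    # The "one line" of checking:
    die "not a My::FeatureHolderI\n"
        unless blessed($feat) && $feat->isa('My::FeatureHolderI');
    return $feat->get_SeqFeatures;
}

my @subs = subfeatures_of(My::NestedFeature->new);
print "nested feature has ", scalar(@subs), " subfeatures\n";

# A flat feature fails loudly at the point of use.
eval { subfeatures_of(My::FlatFeature->new) };
print "flat feature rejected: $@" if $@;
```

If the underlying implementation is swapped for one that is not a 
holder, the failure then shows up at this check instead of as an 
undefined-method error somewhere further down.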

The FeatureHolderI contract is actually simple enough that it's no big 
deal to just demand it of any feature implementation (it'd be easy to 
implement for an always empty array of subfeatures). So if it unties 
some knots for you, I'm fine with you going ahead and making 
SeqFeatureI inherit from FeatureHolderI.
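
As a rough sketch of what that would mean for a flat implementation 
(again with hypothetical My::* stand-ins rather than the real bioperl 
packages): the feature interface inherits from the holder interface, and 
a flat feature class satisfies the contract with an always empty list.

```perl
use strict;
use warnings;

# Hypothetical stand-ins; the real interfaces are Bio::FeatureHolderI
# and Bio::SeqFeatureI.
package My::FeatureHolderI;
sub get_SeqFeatures { die "get_SeqFeatures() not implemented\n" }

package My::SeqFeatureI;
our @ISA = ('My::FeatureHolderI');   # every feature is now also a holder

# A flat feature class: it can never nest, so it returns an empty list.
package My::FlatFeature;
our @ISA = ('My::SeqFeatureI');
sub new { return bless {}, shift }
sub get_SeqFeatures { return () }

package main;

my $feat = My::FlatFeature->new;
my @subs = $feat->get_SeqFeatures;

print "is a holder\n" if $feat->isa('My::FeatureHolderI');
print "subfeature count: ", scalar(@subs), "\n";
```

With this arrangement callers no longer need the isa() test before 
asking for subfeatures; they just have to cope with an empty list.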


>  Surely it is this sort of spooky action at a distance that OO
> claims to eradicate?
>
> In fact, taking your logic to the extreme, why don't all methods just
> advertise themselves as taking in and returning Bio::Root::RootI
> objects? It's just one line of code to say
>
>   if ($obj->isa("Bio::SeqFeatureI")) {    ...    }
>
> This is an extreme example, but the point is there shouldn't be weird
> guesswork involved. Yes, it's one line to test, but how do I know if my
> features will *ever* conform to Bio::FeatureHolderI? Well I know
> because I peeked under the covers and saw that
> Bio::SeqFeature::Generic objects are created. But this can't be the
> correct way of doing things? Can it?
>
> (Actually, I suspect it is the way a lot of people use bioperl...)

That's because of the lack of better documentation, which, 
realistically speaking, we have to accept is going to stay. So you 
have a point here.

>
>> Why don't you make a proposal, state how much that's going to destroy
>> backward compatibility, list what we would gain from it in exchange,
>> identify who's going to refactor the codebase accordingly, and then we
>> vote on it.
>
> Shouldn't backwards compatibility be unaffected since interfaces are
> just semi-formalized natural language documentation? (Except for
> "decorator" interfaces but that's an entirely different rant). I guess
> it could affect areas where people test for interfaces using isa().

That's what I meant. I do this all the time, like it or not.

>
> To be honest, I'm not really up to this. I don't have a clue where to
> start refactoring. It gives me a headache to think about. I'm just a
> simple man in search of a few answers.

Chris, many people out there are like you. My point is that Bioperl 
has the shape that contributors give it. If you don't like something 
about its design, don't expect the people who wrote it that way, 
because they liked it that way, to instead write it the way you like it.

A rant is a good start, but to change the root cause requires committed 
volunteers to steer the best course.

>
>> Otherwise I'm having difficulty seeing the point.
>
> Probably not many people do

Quite possibly. I'm going to bite my tongue though about the 
constructiveness of this comment.

	-hilmar

>
>> 	-hilmar
>>
>>> -----Original Message-----
>>> From: Chris Mungall [mailto:cjm at fruitfly.org]
>>> Sent: Tuesday, June 17, 2003 11:29 AM
>>> To: bioperl-l at bioperl.org
>>> Subject: [Bioperl-l] Bio::FeatureHolderI interface confusion
>>>
>>>
>>>
>>> I'm a little confused by the proliferation of interfaces in
>>> bioperl. I've been tying my head in knots trying to figure
>>> out Bio::FeatureHolderI all morning, and I'm pretty sure
>>> there is something seriously wrong.
>>>
>>> From my knowledge of the guts of bioperl, I know that any features
>>> generated by one of the SeqIO::* classes will be
>>> Bio::SeqFeature::Generic objects, and they will therefore
>>> implement Bio::FeatureHolderI
>>>
>>> However, I'm not sure how that knowledge would be accessible
>>> to the non-initiated.
>>>
>>> - Bio::SeqIO states that SeqI compliant objects are created
>>>
>>> - the Bio::SeqI docs hint that Bio::SeqI may itself be a
>>> Bio::FeatureHolderI, and it therefore implements the method
>>> get_SeqFeatures()
>>>
>>> - However, both Bio::FeatureHolderI and Bio::SeqI state that
>>> get_SeqFeatures() returns a list of Bio::SeqFeatureI objects;
>>> Bio::SeqFeatureI is NOT a Bio::FeatureHolderI. (This of
>>> course means that Bio::FeatureHolderI is NOT necessarily recursive)
>>>
>>> So it seems that there is no guarantee that anything returned
>>> from Bio::SeqIO will implement Bio::FeatureHolderI. However,
>>> I am writing code that assumes the features obtained from a
>>> SeqIO class is a Bio::SeqFeature::Generic and hence
>>> implements Bio::FeatureHolderI. Is this bad?
>>>
>>> If it is bad, how is it possible to get hold of a feature
>>> that implements Bio::FeatureHolderI? Other than rolling my own?
>>>
>>> It seems that the easiest short term solution is to force
>>> Bio::SeqFeatureI to implement Bio::FeatureHolderI
>>>
>>> The 'correct' solution within the context of post-GOF OO
>>> design is to create a new interface
>>> Bio::SeqFeatureThatIsAFeatureHolderI which inherits from both
>>> Bio::SeqFeatureI and Bio::FeatureHolderI.
>>>
>>> Bio::FeatureHolderI->get_SeqFeatures() would then return
>>> instances of this new combined interface. (After all,
>>> Bio::FeatureHolderI isn't much use unless it is recursive).
>>>
>>> However, I think this 'correct' OO solution is just patching
>>> insanity with insanity.
>>>
>>> Personally, I would much rather see the swarm of interfaces
>>> and objects stopped and rolled back.
>>>
>>> In fact, I can't really see why we can't make do with just
>>> Bio::SeqI and Bio::SeqFeatureI
>>>
>>> (or even one class to represent both of these...)
>>>
>>> Why not force every Bio::SeqFeatureI class to implement
>>> get_SeqFeatures()? That seems to be the 'implicit' cryptic
>>> inheritance structure we have now.
>>>
>>> What is the gain in proliferating interfaces?
>>>
>>> One argument is that we might want 'lightweight' seqs or
>>> features. However, this is a false argument since the memory
>>> footprint can be made the same. I cannot think of any
>>> software engineering arguments beyond OO dogma in favour of
>>> proliferating interfaces.
>>>
>>> Perhaps if we were programming in a java straitjacket this
>>> would provide us with some level of compile time checking.
>>> However, java methodology certainly does not translate to
>>> perl. Most of bioperl would not compile if ported directly to
>>> java (for reasons stated above - e.g., accessing
>>> FeatureHolderI methods when not in a FeatureHolderI compliant
>>> object). It seems perverse to mimic this whole compile time
>>> checking infrastructure when no compile time checking is performed.
>>>
>>> I think it also makes things easier for new users if the
>>> number of classes/concepts they have to deal with is kept to
>>> a minimum.
>>>
>>>
>>> --
>>> Chris Mungall
>>> cjm at fruitfly.org
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at portal.open-bio.org
>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


