[Bioperl-l] split seq feature and fuzzy feature proposal

Ewan Birney birney@ebi.ac.uk
Thu, 18 Jan 2001 23:37:24 +0000 (GMT)


On Thu, 18 Jan 2001, Hilmar Lapp wrote:

> 
> Note that as one of the few noticeable changes in the SeqFeatureI
> API this call should be allowed to throw an exception if
> 	1) the start location is uncertain
> 	2) the start location does not refer to the attached seq
> 	(to be disputed)

My feeling is that seqfeature->start should still be well defined. It is
up to the SeqFeature implementing class to "make a sensible
decision" about start/end points.


If it is fuzzy/complex/strange the client can test. If the client does not
want to test and just wants to "draw it", I think insisting that
start/end/seqname return *something* is valid. Otherwise the client has
no real option to figure out what to do with these things...

If we let the implementation objects get away with not implementing this,
the interface becomes less useful...
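
A minimal sketch of this contract, written in Python rather than Perl purely
for illustration. The class, its fields, and the "outermost extent" rule are
all hypothetical and are not part of the Bioperl API; the point is only that
start/end/seqname always return something, while fuzziness stays testable for
clients that care.

```python
from dataclasses import dataclass


@dataclass
class SketchFeature:
    """Hypothetical feature type, not a Bioperl class."""
    min_start: int
    max_start: int
    min_end: int
    max_end: int
    seqname: str = "unknown"

    def is_fuzzy(self) -> bool:
        # A boundary is uncertain when its minimum and maximum differ.
        return (self.min_start != self.max_start
                or self.min_end != self.max_end)

    @property
    def start(self) -> int:
        # Assumed, documented policy: report the outermost extent.
        return self.min_start

    @property
    def end(self) -> int:
        # Same assumed policy on the other boundary.
        return self.max_end


def draw(feature: SketchFeature) -> str:
    # A client that only wants to draw never has to test for fuzziness
    # or catch an exception; start/end/seqname are always defined.
    return f"{feature.seqname}:{feature.start}..{feature.end}"
```

A client that does care can call is_fuzzy() first and fall back to the
min/max values; one that does not can call draw() directly.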

</snip>

> annotations, would you just draw the part referring to the
> attached seq? Ensembl people, any experience/wishlists for this?

Experience on our side is that

90% of things are either SeqFeatures or FeaturePairs and fit the simple
seqfeature interface just fine

the remaining 10% are genes and could be handled via some sort of complex
location thing. As genes have transcripts, which in turn have exons, a simple
mapping to complex locations is not on. For other internal reasons, Ensembl is
very likely to stick with specialised adaptor classes which map Ensembl genes
to Bioperl SeqFeatures, so we are flexible here...


> 
> An obvious requirement is the ability to recover the original
> GenEmbl location string, so all the information necessary should
> be present.

Right. 

> 

</snip>

> 
> min_start()/max_start() etc should also be included. start() and
> end() in an implementation are overridden and throw exceptions,
> depending on which end is uncertain (at least they should be
> expected to throw exceptions). A certain end can be determined by
> min_start() == max_start() (or .._end(), resp.).

I would be in favour of min_start/max_start but against letting start
throw an exception. The implementation has to decide how to "become a hard
feature" from being Fuzzy. It is up to the implementation. As long as this
is documented, this is no more arbitrary than letting the client decide.
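
One way to picture "the implementation decides, and documents it" is sketched
below in Python (again only an illustration; FuzzyRange, its policy names, and
its methods are assumptions and do not exist in Bioperl). min_start/max_start
are exposed, a boundary is certain when the two are equal, and start()/end()
never throw: they collapse the fuzzy boundary according to a named policy.

```python
class FuzzyRange:
    """Hypothetical fuzzy location with a documented collapse policy."""

    # "outer": widest possible extent; "inner": narrowest possible extent.
    POLICIES = ("outer", "inner")

    def __init__(self, min_start, max_start, min_end, max_end,
                 policy="outer"):
        if policy not in self.POLICIES:
            raise ValueError(f"unknown policy: {policy!r}")
        self.min_start = min_start
        self.max_start = max_start
        self.min_end = min_end
        self.max_end = max_end
        self.policy = policy

    def start_is_certain(self):
        # The test proposed in the quoted text: min_start() == max_start().
        return self.min_start == self.max_start

    def end_is_certain(self):
        return self.min_end == self.max_end

    def start(self):
        # Never raises; the documented policy picks the hard value.
        if self.policy == "outer":
            return self.min_start
        return self.max_start

    def end(self):
        if self.policy == "outer":
            return self.max_end
        return self.min_end
```

With the "outer" policy, FuzzyRange(10, 15, 40, 45) reports 10..45; with
"inner" it reports 15..40. Either choice is arbitrary, but it is made once, in
one documented place, rather than separately by every client.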

> 
> > Does this seem more agreeable - location is decoupled from SeqFeature, but
> > we have to support backwards compatibility with SeqFeatureI ISA RangeI
> > which means all SeqFeatures have a start/end...
> > 
> 
> I indeed like the decoupled approach much better.
> 

If we go for a decoupled approach I am keen on it being justified by more
than just "it feels good". We are increasing the complexity here a lot and
we need justification...


> 	Hilmar
> -- 
> -----------------------------------------------------------------
> Hilmar Lapp                                email: hlapp@gmx.net
> GNF, San Diego, Ca. 92122                  phone: +1 858 812 1757
> -----------------------------------------------------------------

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>. 
-----------------------------------------------------------------