[Bioperl-l] split seq feature and fuzzy feature proposal

Jason Stajich jason@chg.mc.duke.edu
Thu, 18 Jan 2001 14:41:51 -0500 (EST)

On Thu, 18 Jan 2001, Hilmar Lapp wrote:

> Jason Stajich wrote:
> > 
> > http://www.bioperl.org/wiki/html/BioPerl/AdvancedSeqFeatureLocations.html
> > 
> > Please look it over, I didn't describe the detail of the fuzzy feature
> > methods because I'm not sure there will be extra methods, just overriding
> > things like start,end to be remapped.  The different feature types need to
> > be differentiated so that Bio::SeqIO::FTHelper can handle them differently
> > when parsing/writing.
> > 
> > Ewan, let me know what I've left off.  Hilmar, does this sound reasonable
> > and straightforward enough to you?
> > 
> You didn't include actual interface definitions, did you? Just
> wondering whether I missed the link.

No - didn't describe actual interfaces since we are still struggling
through this.  Will do that when we agree enough.

> As mentioned before, what bothers me is that in this layout
> location-specific issues impact the class (type) of a SeqFeature.
> Why should any SeqFeature change its type only because its
> location becomes uncertain or compound, and vice-versa?

Ewan and I had decoupled the LocationI from SeqFeature but saw no
advantage, just interface mish-mash.  Perhaps we were too hasty?

What you suggest above could be done as:

Bio::SeqFeatureI ISA RangeI

method : location 
desc   : Get/Set method
args   : LocationI object
returns: LocationI object

method : start()
desc   : start location of seqfeature

sub start {
	my ($self) = @_;
	# delegate to the attached LocationI object
	return $self->location->start();
}

... similar for end ...

Bio::LocationI ISA RangeI

Bio::SplitLocationI ISA Bio::LocationI

method: sub_SeqFeatures()
desc  : method for obtaining list of sub Locations - they could be
        SeqFeature::Exons, SeqFeature::Generic, or LocationI's?
returns: list of LocationI or SeqFeatureI objects?

Bio::FuzzyLocationI ISA Bio::LocationI

method: get_embl_fuzzy_string()
desc  : possible method to return the location as an EMBL string for a fuzzy
        location
returns: string

Does this seem more agreeable - location is decoupled from SeqFeature, but
we have to support backwards compatibility with SeqFeatureI ISA RangeI
which means all SeqFeatures have a start/end... 
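
To make the layout above concrete, here is a minimal sketch of the
delegation, assuming nothing beyond what is outlined in this mail.  The
package names Sketch::SimpleLocation and Sketch::SeqFeature are hypothetical
stand-ins, not existing Bioperl classes, and the real interfaces may well
end up looking different once we agree on them.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a LocationI implementation (not a real Bioperl class).
package Sketch::SimpleLocation;

sub new {
    my ($class, %args) = @_;
    my $self = { start => $args{start}, end => $args{end} };
    return bless $self, $class;
}
sub start { return $_[0]->{start} }
sub end   { return $_[0]->{end} }

# Hypothetical stand-in for a SeqFeatureI implementation.
package Sketch::SeqFeature;

sub new {
    my ($class, %args) = @_;
    my $self = { location => $args{location} };
    return bless $self, $class;
}

# Get/Set method for the attached location object.
sub location {
    my ($self, $loc) = @_;
    $self->{location} = $loc if defined $loc;
    return $self->{location};
}

# start/end stay on the feature for RangeI compatibility and simply delegate.
sub start { return $_[0]->location->start() }
sub end   { return $_[0]->location->end() }

package main;

my $feat = Sketch::SeqFeature->new(
    location => Sketch::SimpleLocation->new(start => 10, end => 42),
);
print join('..', $feat->start(), $feat->end()), "\n";   # prints 10..42
```

Code that only ever calls start/end on the feature keeps working, while
code that cares about the details can ask for the location object itself.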

> I'd rather favor uncoupling a feature and its location, with
> features having a reference to a location object which will give
> further details if the application worries. An application that
> doesn't do anything with the coordinates wouldn't notice a change,
> but an application that e.g. draws features on sequences will have
> to decide what to do if the location object says that the
> coordinates are not well determined. Retrieving the sequence part
> the feature refers to on its attached seq will also be affected:
> doing so for a feature with an uncertain location will result in
> an exception being thrown. Separating SeqFeatureI and LocationI
> allows also for the following: assume a feature with uncertain
> start and end. If you're satisfied with an average start and end,
> you can substitute the location object by a Range with certain
> start and end, and voila - drawing, sequence excision etc will
> just work fine on the very same feature object.
> Maybe I'm missing something.
> 	Hilmar
> -- 
> -----------------------------------------------------------------
> Hilmar Lapp                                email: hlapp@gmx.net
> GNF, San Diego, Ca. 92122                  phone: +1 858 812 1757
> -----------------------------------------------------------------
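
The substitution described in the quoted paragraph above could look roughly
like the following sketch.  Everything in it is an assumption made for
illustration: the Sketch::* packages, the min/max representation of an
uncertain bound, and the to_average and seq_string methods are hypothetical
and not part of any agreed interface.

```perl
use strict;
use warnings;

# Hypothetical fuzzy location: each bound is known only as a (min, max) pair.
package Sketch::FuzzyLocation;

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}
# Asking an uncertain location for exact coordinates throws an exception.
sub start { die "start is uncertain\n" }
sub end   { die "end is uncertain\n" }

# Collapse to a certain location by taking the integer midpoint of each bound.
sub to_average {
    my ($self) = @_;
    return Sketch::CertainLocation->new(
        start => int(($self->{min_start} + $self->{max_start}) / 2),
        end   => int(($self->{min_end}   + $self->{max_end})   / 2),
    );
}

# Hypothetical location with exact coordinates.
package Sketch::CertainLocation;

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}
sub start { return $_[0]->{start} }
sub end   { return $_[0]->{end} }

# Hypothetical feature holding a reference to a location object.
package Sketch::SeqFeature;

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}
sub location {
    my ($self, $loc) = @_;
    $self->{location} = $loc if defined $loc;
    return $self->{location};
}
# Excise the feature's subsequence (1-based, inclusive coordinates);
# dies if the location cannot give exact coordinates.
sub seq_string {
    my ($self, $seq) = @_;
    my $loc = $self->location;
    return substr($seq, $loc->start() - 1, $loc->end() - $loc->start() + 1);
}

package main;

my $seq  = 'ACGTACGTACGT';
my $feat = Sketch::SeqFeature->new(
    location => Sketch::FuzzyLocation->new(
        min_start => 2, max_start => 4, min_end => 8, max_end => 10,
    ),
);

# With the fuzzy location attached, excision fails.
my $ok = eval { $feat->seq_string($seq); 1 };
print "uncertain location: $@" unless $ok;

# Swap in an averaged, certain location on the very same feature object.
$feat->location($feat->location->to_average());
print $feat->seq_string($seq), "\n";   # prints GTACGTA
```

The feature object itself never changes type; only the location it refers
to is replaced.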

Jason Stajich
Center for Human Genetics
Duke University Medical Center