[Biocorba-l] SeqFeatureLocation

Jason Stajich jason@chg.mc.duke.edu
Thu, 8 Feb 2001 12:11:39 -0500 (EST)


I do follow your example below. I guess the only other case is how
just the location 5.10 would be represented. Start and end are known, but
the whole location is fuzzy, not the endpoints. I have made this work in
bioperl by adding a location-level fuzzy code as well, which can be EXACT,
WITHIN, or BETWEEN.
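
A rough sketch of what I mean, in plain Perl. This is not the actual
bioperl or BioCorba code: the function names and the array layout
[start, extension, fuzzy] are made up for illustration, and EXACT = 1 is an
assumption (WITHIN = 2 and BETWEEN = 3 follow the codes in the quoted
message below). It just shows how a fuzzy code on the location itself could
sit alongside the per-position fuzzy codes:

```perl
use strict;
use warnings;

# Fuzzy type codes mapped to their feature-table symbols.
# WITHIN = 2 and BETWEEN = 3 follow the quoted message;
# EXACT = 1 is an assumption made for this sketch.
my %FUZZY_SYMBOL = (1 => '', 2 => '.', 3 => '^');

# Render one position from its start, extension and fuzzy code.
# Hypothetical helper, not part of BioCorba or bioperl.
sub position_string {
    my ($start, $extension, $fuzzy) = @_;
    my $symbol = $FUZZY_SYMBOL{$fuzzy};
    die "unknown fuzzy code $fuzzy\n" unless defined $symbol;
    return "$start" if $symbol eq '';    # EXACT: just the base number
    return '(' . $start . $symbol . ($start + $extension) . ')';
}

# Render a whole location. If the location-level code is not EXACT,
# the endpoints are treated as known and the fuzziness applies to the
# span as a whole, so the two start values are joined by the symbol.
sub location_string {
    my ($start_pos, $end_pos, $location_fuzzy) = @_;
    my $symbol = $FUZZY_SYMBOL{$location_fuzzy};
    die "unknown fuzzy code $location_fuzzy\n" unless defined $symbol;
    return $start_pos->[0] . $symbol . $end_pos->[0] if $symbol ne '';
    return position_string(@$start_pos) . '..' . position_string(@$end_pos);
}

# Fuzzy endpoints, exact location
print location_string([1, 2, 3], [5, 5, 2], 1), "\n";    # (1^3)..(5.10)
# Exact endpoints, fuzzy (WITHIN) location
print location_string([5, 0, 1], [10, 0, 1], 2), "\n";   # 5.10
```

The point of the extra argument is that 5.10 as a whole location does not
have to be squeezed into a start position with no end.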

On Thu, 8 Feb 2001, Alan Robinson wrote:

> 
> On Wed, 7 Feb 2001, Jason Stajich wrote:
> 
> > Okay, so I've been told that <5..100 and 5<..100 mean the same thing.  
> > I feel better about that.
> 
> I'm relieved too since the EMBL FeatureTable specification
> [http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html]
> makes no mention of the latter case and I've been digging around.
> 
> 
> > But I'm still not clear how the location model will handle
> > (5.10) or (1^3) as locations.   Are they really valid locations?  We can
> > fudge it by making the start position be the value (since it can be
> > represented that way) and make it so there is no ending position.  Sort of
> > circumvents the model though.   
> 
> These cases of fuzziness are looked after using a combination of 'start',
> 'extension' and 'fuzzy' variables for the start and end position. The IDL
> is modelled after (i.e. stolen from) the EMBL IDL which handles these
> types of occurrences.
> 
> 
> For your (truly horrible) example location: (1^3)..(5.10), the
> following Perl would return all the information about the location to you:
> 
> 
> # Return the single SeqFeatureLocation for this SeqFeature object
> my @location = @{$mySeqFeature->locations()};
> 
> 
> # First - do the starting position: 1^3
> 
> # Get the start position as a SeqFeaturePosition object:
> my $startSeqFeaturePosition = $location[0]->start;
> 
> # Get the first base - the value should be 1
> my $start = $startSeqFeaturePosition->start;
> 
> # Get the extension of this position - the value should be 2:
> my $extension = $startSeqFeaturePosition->extension;
> 
> # Get the type code for the fuzziness - should be 3 (i.e. BETWEEN
> # or '^' if you look this up in the FuzzyTypeCode interface):
> my $fuzzy = $startSeqFeaturePosition->fuzzy;
> 
> 
> # Now do the end position: 5.10
> 
> # Get the end position as a SeqFeaturePosition object:
> my $endSeqFeaturePosition = $location[0]->end;
> 
> # Get the first base - the value should be 5:
> $start = $endSeqFeaturePosition->start;
> 
> # Get the extension of this position - the value should be 5:
> $extension = $endSeqFeaturePosition->extension;
> 
> # Get the type code for the fuzziness - should be 2 (i.e. WITHIN
> # or '.' if you look this up in the FuzzyTypeCode interface):
> $fuzzy = $endSeqFeaturePosition->fuzzy;
> 
> 
> From the above:
> 
> For the starting position:
> 
>   start = 1
>   extension = 2
>   fuzzy = '^' (BETWEEN)
>   -----------
>   (1^3)
> 
> For the end position:
> 
>   start = 5
>   extension = 5
>   fuzzy = '.' (WITHIN)
>   -----------
>   (5.10)
> 
> 
> Thus the final Location is (1^3)..(5.10).
> 
> 
> Do you follow this? Or is there a problem I've missed?
> 
> 
> 
> > So we don't have a part for a fuzzy location only fuzzy endpoints.  What
> > if the whole location is fuzzy, ie it is within 5.10 but we're not sure
> > where it starts or ends.  To make this work we'd need to add a fuzzy field
> > to the SeqFeatureLocation struct.  
> 
> 
> 

Jason Stajich
jason@chg.mc.duke.edu
Center for Human Genetics
Duke University Medical Center 
http://www.chg.duke.edu/