[Biocorba-l] SeqFeatureLocation
Jason Stajich
jason@chg.mc.duke.edu
Wed, 7 Feb 2001 17:18:59 -0500 (EST)
Okay, so I've been told that
<5..100 and 5<..100 mean the same thing. I feel better about that.
But I'm still not clear how the location model will handle
(5.10) or (1^3) as locations. Are they really valid locations? We can
fudge it by making the start position hold the value (since it can be
represented that way) and leaving the end position unset. That sort of
circumvents the model, though.
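To make the endpoint cases concrete, here is a rough Python sketch of how a single endpoint string could map onto a position with a fuzzy code. It is only an illustration, not the BioCORBA IDL: the field names follow the quoted message below, parse_endpoint is a made-up helper, and I am assuming that extension simply holds the second number of (a.b), as in the (1.2) example.

```python
import re
from dataclasses import dataclass


@dataclass
class SeqFeaturePosition:
    # Field names follow the thread; the Python layout itself is an assumption.
    position: int
    extension: int = 0
    fuzzy: str = "EXACT"


def parse_endpoint(text: str) -> SeqFeaturePosition:
    """Parse one endpoint such as '20', '<20', '20<', '>20', '20>' or '(1.2)'.

    Hypothetical helper, not part of any BioCORBA or Bioperl API.
    """
    text = text.strip()

    # (a.b): a single position somewhere within a..b
    within = re.fullmatch(r"\((\d+)\.(\d+)\)", text)
    if within:
        return SeqFeaturePosition(int(within.group(1)), int(within.group(2)), "WITHIN")

    # Optional '<' or '>' on one side of the number, but not on both sides
    plain = re.fullmatch(r"([<>]?)(\d+)([<>]?)", text)
    if not plain or (plain.group(1) and plain.group(3)):
        raise ValueError(f"unrecognised endpoint: {text!r}")

    # '<5' and '5<' are treated as the same thing, as described above
    marker = plain.group(1) or plain.group(3)
    fuzzy = {"<": "BEFORE", ">": "AFTER", "": "EXACT"}[marker]
    return SeqFeaturePosition(int(plain.group(2)), 0, fuzzy)
```

With this, parse_endpoint("<5") and parse_endpoint("5<") give equal objects, which is the equivalence I was told about. It says nothing yet about (5.10) or 1^3 standing alone as a whole location.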
So we don't have a part for a fuzzy location, only fuzzy endpoints. What
if the whole location is fuzzy, i.e. it is within 5.10 but we're not sure
where it starts or ends? To make this work we'd need to add a fuzzy field
to the SeqFeatureLocation struct.
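Roughly what I have in mind, sketched in Python rather than IDL. The location-level fuzzy field is the proposed addition and does not exist in the current model; reusing the 'WITHIN' code for it is my assumption, as is the class layout.

```python
from dataclasses import dataclass


@dataclass
class SeqFeaturePosition:
    position: int
    extension: int = 0
    fuzzy: str = "EXACT"


@dataclass
class SeqFeatureLocation:
    start: SeqFeaturePosition
    end: SeqFeaturePosition
    # Proposed extra field (assumption): fuzziness of the location as a whole,
    # kept separate from the fuzzy codes on the two endpoints.
    fuzzy: str = "EXACT"


# 1..100: exact endpoints, exact location
exact = SeqFeatureLocation(SeqFeaturePosition(1), SeqFeaturePosition(100))

# (5.10)..100: only the start endpoint is fuzzy
fuzzy_start = SeqFeatureLocation(
    SeqFeaturePosition(5, 10, "WITHIN"), SeqFeaturePosition(100)
)

# (5.10) as a whole location: the feature lies somewhere within 5 to 10,
# so both bounds are stored exactly and the location itself is marked fuzzy
whole = SeqFeatureLocation(
    SeqFeaturePosition(5), SeqFeaturePosition(10), fuzzy="WITHIN"
)
```

The point of the extra field is that the last two cases stay distinguishable without leaving the end position empty.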
Thanks.
On Tue, 6 Feb 2001, Jason Stajich wrote:
> Alan - I've finally gotten a chance to think about the location model some
> more..
>
> I like that SeqFeatureLocation has 2 SeqFeaturePositions for start and end,
> and that SeqFeaturePosition can handle all the codes. But I want to be sure
> that the following cases can really be handled by this. Correct me if any of
> these are wrong. I think we might need another field to represent whether
> a BEFORE or AFTER fuzzy code applies to the 3' or 5'
> strand... Or we change the codes to be BEFORE-3', BEFORE-5', AFTER-3',
> AFTER-5'. Yuck, I know...
>
> 1..100 -- Location has 2 position objects for start and end; fuzzy code is
> 'EXACT', extension = 0
> (1.2)..30 -- start is a position with fuzzy code 'WITHIN', position = 1,
> extension = 2
> 1^3 -- is this a legal location? If it is, how do we represent it?
> <20..40 -- (feature starts before bp 20 on 5' strand), position=20
> extension=0, fuzzy='BEFORE'
> >20..100 -- (feature starts after bp 20 on 5' strand) fuzzy='AFTER'
> 20<..100 -- (feature starts before bp 20 on 3' strand) fuzzy='BEFORE'
> 20>..100 -- (feature starts after bp 20 on 3' strand) fuzzy='AFTER'
>
> see
> http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html#FeaturesB
> for an explanation
>
> "The location of each feature is provided as well, and can be a single
> base, a contiguous span of bases, a joining of sequence spans, and
> other representations. If a feature is located on the complementary
> strand, the word "complement" will appear before the base span. If the
> "<" symbol precedes a base span, the sequence is partial on the 5' end
> (e.g., CDS <1..206). If the ">" symbol follows a base span, the
> sequence is partial on the 3' end (e.g., CDS 435..915>)."
>
>
> Jason Stajich
> jason@chg.mc.duke.edu
> Center for Human Genetics
> Duke University Medical Center
> http://www.chg.duke.edu/
Jason Stajich
jason@chg.mc.duke.edu
Center for Human Genetics
Duke University Medical Center
http://www.chg.duke.edu/