[Bioperl-l] Re: No joins

Brian King brian.king@animorphics.net
Fri, 23 Aug 2002 11:10:02 -0700 (PDT)


Ewan,

Here's an example of representing a join with a fuzzy
start region using generic sub-features.  The location
descriptor is

CDS             join(<2762..2959,3175..3319)

Each region gets its own generic sub-feature of type
"-".  The seq_location for the fuzzy region has
sub-elements called spans to hold the fuzziness.
Spans represent things like (20.30), <2762, or 23^24.
The seq_location has a clean start and length that are
useful for visualization, and if you want to know the
fuzzy details for a text representation you can go
look in the spans.  And still no "joins" in the data
model!

AGAVE representation:

<seq_feature type="CDS">
  <seq_location start="2762" length="558" 
                orientation="+"/>
  <seq_feature type="-">
    <seq_location start="2762" length="198"
                  orientation="+">
      <start><span start="2762" length="1"
                   exact="false"
                   type="single"/></start>
      <end><span start="2959" length="1"
                 exact="true"
                 type="single"/></end>
    </seq_location>
  </seq_feature>
  <seq_feature type="-">
    <seq_location start="3175" length="145" 
                  orientation="+"/>
  </seq_feature>
</seq_feature>
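
To make the mapping concrete, here is a minimal Python sketch of how the
descriptor above could be split into a parent range plus per-region spans.
The names Span, Region, parse_endpoint and parse_join are made up for
illustration and are not part of AGAVE or any library.  It only handles
plain and </> endpoints, not (20.30) or 23^24, and it treats "order" the
same as "join".

```python
import re
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical stand-in for an AGAVE <span> element."""
    start: int
    length: int
    exact: bool
    type: str = "single"


@dataclass
class Region:
    """Hypothetical stand-in for one sub-feature's seq_location."""
    start: int
    length: int
    start_span: Span
    end_span: Span


def parse_endpoint(text):
    """Parse '2762', '<2762' or '>3319' into a single-position Span."""
    text = text.strip()
    fuzzy = text[0] in "<>"
    position = int(text.lstrip("<>"))
    return Span(start=position, length=1, exact=not fuzzy)


def parse_join(descriptor):
    """Return (parent_start, parent_length, regions) for 'join(a..b,c..d)'."""
    match = re.fullmatch(r"(?:join|order)\((.*)\)", descriptor.strip())
    if match is None:
        raise ValueError("unsupported descriptor: " + descriptor)
    regions = []
    for part in match.group(1).split(","):
        lo, hi = (parse_endpoint(p) for p in part.split(".."))
        # Clean start/length for display; fuzzy detail stays in the spans.
        regions.append(Region(lo.start, hi.start - lo.start + 1, lo, hi))
    parent_start = min(r.start for r in regions)
    parent_end = max(r.start + r.length - 1 for r in regions)
    return parent_start, parent_end - parent_start + 1, regions


start, length, regions = parse_join("join(<2762..2959,3175..3319)")
print(start, length, [(r.start, r.length) for r in regions])
```

The parent start/length and the two region lengths come out the same as
the values in the XML above; the "<" only shows up as exact=False on the
first region's start span.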

I documented all the EMBL-like location descriptor
cases at 

http://www.animorphics.net/agave/doc/location_representation.html.

You can't quite do round-trip parsing because the
format doesn't distinguish between "join" and "order",
but that would be easy to fix with another attribute.
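
As a rough sketch of what that extra attribute would buy you: if each
parent feature carried an operator value (a hypothetical attribute, not
something in AGAVE today), the text descriptor could be regenerated from
the sub-regions.  The function name and the tuple layout below are made
up for the example, and only </> fuzziness is covered.

```python
def format_descriptor(operator, regions):
    """Rebuild an EMBL-like descriptor from sub-regions.

    operator: "join" or "order" (the hypothetical extra attribute).
    regions:  list of (start, end, start_exact, end_exact) tuples.
    """
    parts = []
    for start, end, start_exact, end_exact in regions:
        # An inexact start is written '<n', an inexact end '>n'.
        lo = str(start) if start_exact else "<" + str(start)
        hi = str(end) if end_exact else ">" + str(end)
        parts.append(lo + ".." + hi)
    return operator + "(" + ",".join(parts) + ")"


regions = [(2762, 2959, False, True), (3175, 3319, True, True)]
print(format_descriptor("join", regions))
print(format_descriptor("order", regions))
```

Without the operator value both calls would have to produce the same
text, which is exactly the ambiguity described above.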

Regards,
Brian

--- Brian King <kingb_98@yahoo.com> wrote:
> 
> >  but most fuzzies are also joins (in fact, a lot of
> > joins have fuzzy ends), so... it became the de facto
> > way to handle joins.
> 
> If your location class supports fuzzy regions, then
> can't you move the fuzzy regions out of a join and
> into fuzzy locations of generic sub-features?  If it's
> the end regions that are fuzzy, then the parent has to
> retain the fuzziness, but at least you get rid of the
> join.  AGAVE has a structure for fuzzy regions, so
> I'll look up a GenBank example and make an XML version
> again.  Probably next week.
> 
> 
> > Practical question - what does BioJava do with the
> > Fuzzies?
> > 
> 
> BioJava has a base Location class that has
> getMin()/getMax() methods to handle the simple case of
> a contiguous region.  There are subclasses to handle
> the more complex locations, such as a FuzzyLocation.
> 
> FuzzyLocation takes outerMin, outerMax, innerMin,
> innerMax, isMinFuzzy, isMaxFuzzy arguments in the
> constructor.  There is a default policy that tells how
> the fuzzy ends get interpreted in getMin()/getMax(),
> and the policy is settable.  So for any location you
> can deal with a simple range, or request all the gory
> detail.  It's complicated, but true to the input data.
> 
> Regards,
> Brian 
> 
> 
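
For anyone who wants to see the shape of that idea without reading the
BioJava source, here is a minimal Python analogue.  It is not the BioJava
API: the Policy class, the OUTER/INNER constants and the choice of OUTER
as the default are all assumptions made for the sketch.  Only the six
constructor arguments follow the description quoted above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Policy:
    """Hypothetical pair of rules that turn fuzzy ends into plain numbers."""
    resolve_min: Callable[["FuzzyLocation"], int]
    resolve_max: Callable[["FuzzyLocation"], int]


# Two example policies: widest possible extent, or narrowest.
OUTER = Policy(lambda loc: loc.outer_min, lambda loc: loc.outer_max)
INNER = Policy(lambda loc: loc.inner_min, lambda loc: loc.inner_max)


@dataclass
class FuzzyLocation:
    outer_min: int
    outer_max: int
    inner_min: int
    inner_max: int
    is_min_fuzzy: bool
    is_max_fuzzy: bool
    policy: Policy = OUTER  # default chosen arbitrarily for this sketch

    def get_min(self):
        """Simple-range view of the start, as decided by the policy."""
        return self.policy.resolve_min(self)

    def get_max(self):
        """Simple-range view of the end, as decided by the policy."""
        return self.policy.resolve_max(self)


# (20.30)..50 : the start lies somewhere in 20..30, the end is exactly 50.
loc = FuzzyLocation(20, 50, 30, 50, is_min_fuzzy=True, is_max_fuzzy=False)
print(loc.get_min(), loc.get_max())
loc.policy = INNER
print(loc.get_min(), loc.get_max())
```

Callers that only need a range use get_min()/get_max(); callers that care
about the fuzzy detail read the four bounds and the two flags directly.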

