[Bioperl-l] *major* error in genbank parser or am i just insane?

Hilmar Lapp hlapp@gnf.org
Wed, 7 Aug 2002 15:20:17 -0600


At the risk of becoming obnoxious, I reiterate that there are 2 
possibilities for genbank feature table locations to express reverse 
strand joins, and their semantics (how the seq is obtained for the 
feature) are /not/ the same. You can't collapse them into one 
and expect genbank to be roundtripped. If you do collapse them, I'd 
double-check that the semantics are preserved correctly.
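
To make the "how the seq is obtained" point concrete, here is a minimal
Python sketch. It is only an illustration: the 20 bp sequence, the two
exon ranges, and the helper names revcomp() and sub() are made up for
this example and are not Bioperl code. It compares the two feature table
forms with a naive collapse that pushes complement() into the join
without reversing the order of the sublocations.

```python
# Toy illustration only: the sequence and coordinates are made up.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def revcomp(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def sub(seq, start, end):
    """Extract a 1-based, inclusive range, as in the feature table."""
    return seq[start - 1:end]

seq = "AACCGGTTACGTAGCTAGGA"   # hypothetical 20 bp entry
a, b = (1, 5), (11, 15)        # hypothetical exon ranges

# complement(join(1..5,11..15)): join first, then complement the result
form1 = revcomp(sub(seq, *a) + sub(seq, *b))

# join(complement(11..15),complement(1..5)): complement each range,
# then join in the order listed
form2 = revcomp(sub(seq, *b)) + revcomp(sub(seq, *a))

# naive collapse: complement each range but keep the original order
naive = revcomp(sub(seq, *a)) + revcomp(sub(seq, *b))

print(form1 == form2)   # the two forms agree when the order is reversed
print(form1 == naive)   # the naive collapse yields a different sequence
```

In this toy case the two forms give the same sequence only because the
second one lists the ranges in reverse order; a collapse that ignores
the order changes the result.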

The problem is that biosql is already out-of-sync here with bioperl 
as well as genbank: both parent /and/ sublocations have a strand 
attribute ...
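
A rough sketch of why one strand value per sublocation is not enough
for roundtripping. Everything here is hypothetical: SplitLoc,
to_genbank() and flatten() are made-up names, and the strand
assignments (parent -1 with subs +1 for the first form, parent +1 with
subs -1 for the second) are just the reading proposed in this thread,
not a settled convention.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SplitLoc:
    strand: int                        # strand of the parent (split) location
    subs: List[Tuple[int, int, int]]   # (start, end, strand) per sublocation

def to_genbank(loc):
    """Render the location using both parent and sublocation strands."""
    parts = []
    for start, end, strand in loc.subs:
        rng = f"{start}..{end}"
        parts.append(f"complement({rng})" if strand == -1 else rng)
    joined = "join(" + ",".join(parts) + ")"
    return f"complement({joined})" if loc.strand == -1 else joined

def flatten(loc):
    """Collapse to one effective strand per sublocation (hypothetical rows)."""
    return sorted((s, e, loc.strand * strand) for s, e, strand in loc.subs)

form1 = SplitLoc(-1, [(2691, 4571, 1), (4918, 5163, 1)])
form2 = SplitLoc(1, [(4918, 5163, -1), (2691, 4571, -1)])

print(to_genbank(form1))
print(to_genbank(form2))
print(flatten(form1) == flatten(form2))
```

With both strand attributes the two genbank strings can be reproduced;
after flattening, both forms produce identical rows, so the original
form can no longer be recovered from them.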

Sorry, maybe I'm just paranoid ...

	-hilmar

On Wednesday, August 7, 2002, at 12:39  PM, Chris Mungall wrote:

>
> i would have thought the sublocations' strand should be -1, as they
> represent exons on the reverse strand. but i don't really understand
> the whole bioperl location+seqfeature semantics/model; when outside
> the bioperl world i just have one class that rolls seqfeature and
> location into one.
>
> i'm happy to have hilmar revoke my fix and instead go with checking the
> parent location strand rather than the sublocation strand (if someone
> could fix the genbank dumper to print the complement correctly that
> would be great). if we go this route i will fix bioperl-db so that the
> parent location strand goes into the seqfeature_location table. note
> that this will introduce a slight disjunction between biosql and
> bioperl (in biosql we absolutely must represent -ve strand exons as
> seqfeature_location.strand = -1). hmm, how does biojava handle this?
>
> On Wed, 7 Aug 2002, Hilmar Lapp wrote:
>
>> After looking at Chris' fix, it appears to be wrong: it would set
>> the sublocs' strand to -1. The problem lies elsewhere, I'm going to
>> revoke that fix.
>>
>> 	-hilmar
>>
>> On Wednesday, August 7, 2002, at 10:10  AM, Hilmar Lapp wrote:
>>
>>> I have no idea what the present status on that is, but my reply was
>>> generally not about a long-term/high-level/design/it would
>>> be much better if/ discussion. I basically asked the question what
>>> complement(join(1..100,201..300)) exactly means, and whether it has
>>> been decided how exactly it shall be translated into strand()
>>> attributes of the parent and sub-locations. This hasn't been
>>> answered yet ...
>>>
>>> Quoting from the FT definition:
>>>
>>> complement(join(2691..4571,4918..5163))
>>>     Joins regions 2691 to 4571 and 4918 to 5163, then
>>>     complements the joined segments (the feature is on the
>>>     strand complementary to the presented strand)
>>>
>>> join(complement(4918..5163),complement(2691..4571))
>>>     Complements regions 4918 to 5163 and 2691 to 4571, then
>>>     joins the complemented segments (the feature is on the
>>>     strand complementary to the presented strand)
>>>
>>> The case in question is the first example. To translate this
>>> properly to Bioperl locations, this means the parent SplitLoc is
>>> strand -1, whereas the subs are strand +1. Right?
>>>
>>> 	-hilmar
>>>
>>>
>>> On Tuesday, August 6, 2002, at 10:24  PM, Chris Mungall wrote:
>>>
>>>> ok, committed - it seems to have had some weird knock-on effect
>>>> breaking other tests - i can uncommit if this is bad
>>>>
>>>> On Wed, 7 Aug 2002, Elia Stupka wrote:
>>>>
>>>>>> we need a short term fix for the standard situation even more -
>>>>>> shall i commit my change or will this mess things up more?
>>>>>
>>>>> Please commit it, I cannot stand when
>>>>> long-term/high-level/design/it would be much better if/
>>>>> discussions get in the way of production improvement fixes.
>>>>>
>>>>> Once it's committed I can set off a script for the diffing of
>>>>> in/out genbank so you can be comfortable that it's not screwing
>>>>> up the rest of genbank parsing ;)
>>>>>
>>>>> Elia
>>>>>
>>>>> ********************************
>>>>> * http://www.fugu-sg.org/~elia *
>>>>> * tel:    +65 6874 1467        *
>>>>> * mobile: +65 9030 7613        *
>>>>> * fax:    +65 6777 0402        *
>>>>> ********************************
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>> --
>>> -------------------------------------------------------------
>>> Hilmar Lapp                            email: lapp at gnf.org
>>> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
>>> -------------------------------------------------------------