[Bioperl-l] *major* error in genbank parser or am i just insane?

Jason Stajich jason@cgt.mc.duke.edu
Fri, 9 Aug 2002 13:54:58 -0400 (EDT)


To push this onto the subfeatures as you suggest is going to take a fair
amount of refactoring in the current parsing code but would probably be
the best idea.

No one has been brave enough to go in there and mess with things very
much.  A full refactor is okay with me but we sort of already did that
when we moved to locations/seqfeature separation.  We also have the
problem that we support hierarchical features AND hierarchical locations.
At the BoF we discussed describing coding conventions to ensure that
people follow a convention that works.  I'm still unclear what the right
path ahead should be.


I'd suggest that we first derive a set of test cases which break the
expected semantics, put these in a new test file or in t/SeqIO.t, show
that the parser currently does the wrong thing, and then set about fixing
it.  The tests should also check that the expected DNA is returned in all
of these cases.  If we have a test system in place that does this
properly we'll have a much better time tracking down errors and being
consistent.

I think cases like:

complement(join(1..200,205..300),complement(500..600))
join(complement(1..200),205..300,complement(500..600))

need to be properly tested.
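
To make the expected semantics concrete, here is a rough sketch of the kind
of check I mean.  It is written in Python only to keep it short and
standalone; it is not BioPerl code, and every function name in it
(revcomp, split_args, flatten, extract) is made up for illustration.  It
assumes the semantics proposed in the quoted mail below: strand lives only
on the simple sublocations, and an outer complement() is distributed over
its components with their order reversed.  It only understands plain
a..b ranges, join() and complement().

```python
import re

# Hypothetical helpers for illustration; none of these names come from BioPerl.
_COMP = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMP)[::-1]


def split_args(text):
    """Split on commas that are not nested inside parentheses."""
    args, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            args.append(text[start:i])
            start = i + 1
    args.append(text[start:])
    return args


def flatten(loc):
    """Return simple (start, end, strand) sublocations in reading order."""
    loc = loc.strip()
    m = re.fullmatch(r"(\d+)\.\.(\d+)", loc)
    if m:
        return [(int(m.group(1)), int(m.group(2)), 1)]
    if loc.startswith("join(") and loc.endswith(")"):
        parts = []
        for arg in split_args(loc[5:-1]):
            parts.extend(flatten(arg))
        return parts
    if loc.startswith("complement(") and loc.endswith(")"):
        args = split_args(loc[11:-1])
        if len(args) != 1:
            raise ValueError("complement() takes one operand: " + loc)
        # Distribute the complement: flip each strand and reverse the order.
        return [(s, e, -strand) for s, e, strand in reversed(flatten(args[0]))]
    raise ValueError("unsupported location: " + loc)


def extract(seq, loc):
    """Return the DNA described by a location string."""
    out = []
    for start, end, strand in flatten(loc):
        piece = seq[start - 1:end]  # GenBank coordinates are 1-based, inclusive
        out.append(revcomp(piece) if strand < 0 else piece)
    return "".join(out)


# An outer complement and its distributed form must give the same DNA.
seq = "ACGTTGCAAGGCTA"
outer = extract(seq, "complement(join(1..3,6..9))")
inner = extract(seq, "join(complement(6..9),complement(1..3))")
assert outer == inner == revcomp("ACG" + "GCAA")
```

With this, flatten() on the second case above gives three sublocations on
strands -1, +1, -1 in the order written.  The first case, as written, hands
complement() two operands, so the sketch raises an error rather than guess
what was meant.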



On Thu, 8 Aug 2002, Hilmar Lapp wrote:

>
>
> > -----Original Message-----
> > From: Ewan Birney [mailto:birney@ebi.ac.uk]
> > Sent: Thursday, August 08, 2002 8:30 AM
> > To: Chris Mungall
> > Cc: Hilmar Lapp; Elia Stupka; Jason Stajich; bioperl-l@bioperl.org
> > Subject: Re: [Bioperl-l] *major* error in genbank parser or am i just
> > insane?
> >
> [...]
> >
> > I do prefer chris' semantics to having to hold onto the
> > difference between
> > a parent complement and a child complement - ie, I think we should
> > implicitly only allow the complement to happen on simple sequence
> > locations and never splits, and genbank with an outer complement is an
> > implicit distributive complement and reverse of its components.
> >
>
> OK. So this is a vote for sublocs on strand -1 and splitloc strandless, right?
>
> Even though this differs from the present implementation, I too think this is saner. So my vote goes here too. Jason? (I know we decided otherwise in Canada :o)
>
> 	-hilmar
>

-- 
Jason Stajich
Duke University
jason at cgt.mc.duke.edu