[Bioperl-l] *major* error in genbank parser or am i just insane?

Ewan Birney birney@ebi.ac.uk
Wed, 7 Aug 2002 08:54:20 +0100 (BST)


On Tue, 6 Aug 2002, Chris Mungall wrote:

> 
> bear in mind the situation I'm talking about is not for weirdo isolated
> transplicing cases or 20 year old records entered by some crazed lunatic -
> this bug happens if you go to NCBI, download the human assembly build29 and
> parse it: half of the mRNAs are wrong, unless i'm doing something
> fundamentally wrong

I agree that it seems weird that no one else noticed this bug, but Chris,
if this fixes it, go for it...


> 
> we need a good long term solution that is robust for all of genbank, but
> we need a short term fix for the standard situation even more - shall i
> commit my change or will this mess things up more?
> 
> all it is
> 
> $location->strand($strand)
> 
> in FTHelper.pm
> 
> On Wed, 7 Aug 2002, Elia Stupka wrote:
> 
> > > I doubt that the cross-product of location types and genbank entries
> > > has been tested in its entirety, so something may have easily
> > > escaped.
> >
> > Definitely not. A long time ago, when I wrote the Genbank parser, I did
> > a few file-parser-file cycles to check that the files spat out were
> > identical.
> >
> > As we all know public databases will never cease to amaze us, so the only
> > way to be bug-free on this would be to parse all of Genbank on a regular
> > basis, spit it out again in Genbank format and write a log of the diff and
> > look into that diff.
> >
> > Sorry for dropping the standard of the e-mail conversation from grammar
> > discussion to this idiotic paranoid level of checking, but it is the
> > only way to do it.
> >
> > Since we keep public DBs in biosql and we will need more and more public
> > dbs in biosql, I can easily put a hook in to spit out the files again and
> > diff them, and on a monthly basis for example post the "findings" so we
> > can then all go off and fix bugs...
> >
> > Does it sound right?
> >
> > By the way, I posted a fix for missing GIs, has it landed on the bioperl
> > list? I assume all is fine, just checking.
> >
> > Elia
> >
> > ********************************
> > * http://www.fugu-sg.org/~elia *
> > * tel:    +65 6874 1467        *
> > * mobile: +65 9030 7613        *
> > * fax:    +65 6777 0402        *
> > ********************************
> >
> >
> >
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
> 
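
For what it's worth, the round-trip check Elia describes above (parse a
record, write it back out, diff against the original) could be sketched
roughly as below. This is only an illustration in Python using the standard
difflib module, not BioPerl code: round_trip_diff, toy_parse, toy_write and
toy_write_lossy are hypothetical names, and the "lossy" writer just mimics a
writer that loses strand information. A real hook would plug in the actual
Genbank parser and writer instead.

```python
import difflib
import re
from typing import Callable, List


def round_trip_diff(original_text: str,
                    parse: Callable[[str], object],
                    write: Callable[[object], str]) -> List[str]:
    """Parse a record, write it back out, and return a unified diff.

    An empty list means the round trip reproduced the input line for line.
    """
    regenerated = write(parse(original_text))
    return list(difflib.unified_diff(
        original_text.splitlines(),
        regenerated.splitlines(),
        fromfile="original",
        tofile="round_trip",
        lineterm="",
    ))


# Hypothetical stand-ins for a real parser/writer pair (assumptions, not BioPerl).
def toy_parse(text: str) -> List[str]:
    """'Parse' a record by splitting it into lines."""
    return text.splitlines()


def toy_write(record: List[str]) -> str:
    """Faithful writer: emits exactly what was parsed."""
    return "\n".join(record)


def toy_write_lossy(record: List[str]) -> str:
    """Buggy writer: drops the complement(...) wrapper, i.e. loses the strand."""
    return "\n".join(re.sub(r"complement\((.*)\)", r"\1", line) for line in record)


# A made-up three-line record, only loosely shaped like a Genbank entry.
SAMPLE = (
    "LOCUS       TOY\n"
    "     mRNA            complement(join(1..10,20..30))\n"
    "//"
)

if __name__ == "__main__":
    # The lossy writer shows up in the diff; the faithful one prints nothing.
    for line in round_trip_diff(SAMPLE, toy_parse, toy_write_lossy):
        print(line)
```

Run over a whole database dump, the non-empty diffs would be the "findings"
to post and look into.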

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>. 
-----------------------------------------------------------------