[Bioperl-l] *major* error in genbank parser or am i just insane?

Elia Stupka elia@fugu-sg.org
Wed, 7 Aug 2002 10:22:29 +0800 (SGT)


> I doubt that the cross-product of location types and genbank entries 
> has been tested in its entirety, so something may have easily 
> escaped.

Definitely not. A long time ago, when I wrote the Genbank parser, I ran
a few file-parser-file cycles to check that the files written out were
identical to the originals.

As we all know, public databases will never cease to amaze us, so the only
way to be bug-free on this would be to parse all of Genbank on a regular
basis, write it out again in Genbank format, log the diff against the
original files, and look into that diff.
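
To make the idea concrete, here is a rough sketch of such a round-trip
check. It is written in Python purely for illustration and is not BioPerl
code: the `parse` and `write` arguments are hypothetical placeholders for
whatever parser and writer are under test, and the function names are
made up for this example.

```python
import difflib


def round_trip_diff(original_text, parse, write, name="entry"):
    """Parse one flat-file entry, write it back out, and return a unified diff.

    `parse` and `write` are placeholders (assumptions) for the parser and
    writer being checked. An empty string means the round trip was identical.
    """
    regenerated = write(parse(original_text))
    diff = difflib.unified_diff(
        original_text.splitlines(keepends=True),
        regenerated.splitlines(keepends=True),
        fromfile=name + " (original)",
        tofile=name + " (round-trip)",
    )
    return "".join(diff)


def check_entries(entries, parse, write):
    """Return {name: diff} for every entry whose round trip is not identical."""
    report = {}
    for name, text in entries.items():
        diff = round_trip_diff(text, parse, write, name)
        if diff:
            report[name] = diff
    return report
```

Only the entries that come back different end up in the report, so the
log stays short enough to actually read.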

Sorry for dropping the standard of the e-mail conversation from grammar
discussion to this idiotic, paranoid level of checking, but it is the
only way to do it.

Since we keep public DBs in biosql, and we will need more and more public
DBs in biosql, I can easily put a hook in to write the files out again and
diff them, and then, say once a month, post the "findings" so we can all
go off and fix bugs...

Does it sound right?

By the way, I posted a fix for missing GIs. Has it landed on the bioperl
list? I assume all is fine, just checking.

Elia

********************************
* http://www.fugu-sg.org/~elia *
* tel:    +65 6874 1467        *
* mobile: +65 9030 7613        *
* fax:    +65 6777 0402        *
********************************