[Bioperl-l] changes to GenBank (and other) parsing in 0.6.2?

Mark Wilkinson mwilkinson@gene.pbi.nrc.ca
Tue, 10 Oct 2000 09:04:56 -0600


> BTW if there are things in your re-parsing that correct for syntactic
> errors (i.e., bugs) of the Bioperl parser, please let us know.

Unfortunately not - my re-parser is pretty "dumb". It throws away all
feature categories that do not reliably report their strandedness,
it considers $SeqObj->top_level features to be "genes" (or at least things
containing sub-features), and their sub-features to *not* be genes despite
what their tags say, and it throws away the one big top-level feature which
represents the entire sequence of interest, as this isn't particularly
interesting ;-)

It doesn't do anything particularly magical or interesting - it really is
just cleaning up the data for graphical display in an attempt to ensure that
you don't have the graphical widgets drawing over each other or being
duplicated.  (BTW, when I say "throw away" I mean only that it ignores
these things... they are still available in the $SeqObj.)
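
The filtering rules described above can be sketched roughly as follows. This
is only an illustration, not the actual re-parser and not Bioperl code: the
Feature class, the UNRELIABLE_STRAND_TAGS set and the select_for_display
function are all hypothetical names, and the tags listed as unreliable are
placeholders chosen for the example. It assumes the whole-sequence feature
can be recognised by spanning positions 1 to the sequence length.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Feature:
    """Hypothetical stand-in for a parsed sequence feature."""
    tag: str                      # e.g. "gene", "CDS", "source"
    start: int
    end: int
    strand: Optional[int] = None  # +1, -1, or None when unknown
    sub_features: List["Feature"] = field(default_factory=list)


# Assumption: placeholder list of categories whose strand is not trusted.
UNRELIABLE_STRAND_TAGS = {"misc_feature", "repeat_region"}


def select_for_display(
    top_level: List[Feature], seq_length: int
) -> List[Tuple[Feature, List[Feature]]]:
    """Return (gene, parts) pairs to draw.

    Nothing is deleted: skipped features are simply not returned,
    and the input list is left untouched.
    """
    selected = []
    for feat in top_level:
        # Ignore categories with unreliable strandedness.
        if feat.tag in UNRELIABLE_STRAND_TAGS:
            continue
        # Ignore the one big feature covering the entire sequence.
        if feat.start == 1 and feat.end == seq_length:
            continue
        # Every remaining top-level feature is drawn as a "gene";
        # its sub-features are drawn as parts, whatever their tags say.
        parts = [
            sub for sub in feat.sub_features
            if sub.tag not in UNRELIABLE_STRAND_TAGS
        ]
        selected.append((feat, parts))
    return selected
```

Because the function only builds a new list, everything it skips is still
reachable from the original feature list, which matches the "ignore, not
delete" behaviour described above.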

We are still waiting for the NRC to create a copyright statement for us and
then we will release it.  So far we have had some really helpful suggestions
for changes/enhancements from our NRC colleagues in Halifax, so it will be
great when y'all can muck about with it for a while and give us additional
feedback!

Cheers group!

M

--
---
Dr. Mark Wilkinson
Bioinformatics Group
National Research Council of Canada
Plant Biotechnology Institute
110 Gymnasium Place
Saskatoon, SK
Canada