[Biopython-dev] GenBank parser -- first go

Wed Dec 6 03:28:59 EST 2000

----- Original Message -----
From: "Jeffrey Chang" <jchang at SMI.Stanford.EDU>
To: "Brad Chapman" <chapmanb at arches.uga.edu>
Cc: <biopython-dev at biopython.org>
Sent: Monday, December 04, 2000 11:36 PM
Subject: Re: [Biopython-dev] GenBank parser -- first go

> Hi Brad,
>
> On Mon, 4 Dec 2000, Brad Chapman wrote:
>
> > Hello all;
> > As promised, I spent this weekend getting together a GenBank parser,
> > which I hope is something that we could include in Biopython in the
> > future. What I've got so far is available from:
>
   Does it strip HTML tags?  When I ran check_output.py on an HTML file, it
produced this output:

C:\gb_parser-20001204\Scripts>python check_output.py nutmeg.htm
Traceback (most recent call last):
  File "check_output.py", line 25, in ?
    iterator = GenBank.Iterator(handle, parser)
  File "c:\biopyt~1.90d\Bio\PGML\GenBank\GenBank.py", line 57, in __init__
    self._reader = RecordReader.StartsWith(handle, "LOCUS")
  File "c:\biopyt~1.90d\Martel\RecordReader.py", line 130, in __init__
    self.tagtable)
  File "c:\biopyt~1.90d\Martel\RecordReader.py", line 89, in _find_begin_positions
    raise ReaderError("invalid format starting with %s" % repr(text[:50]))
Martel.RecordReader.ReaderError: invalid format starting with '<!DOCTYPE HTML PU

  The problem with converting the page to text in the browser first is that
Netscape, Explorer, and probably others use different algorithms and produce
different text output.
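
  A browser-independent alternative might be to strip the tags in Python
before the data reaches the iterator. The following is only a sketch, and it
assumes a modern Python standard library (html.parser and io). The names
strip_html and genbank_handle are hypothetical and are not part of the GenBank
parser or Martel. It also assumes the record text sits in the page as plain
character data with its line breaks intact, which may not hold for every page.

```python
import io
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect only the character data of an HTML page, dropping all tags."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &lt; into '<'.
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def strip_html(page):
    """Return the text content of an HTML string with the markup removed."""
    extractor = _TextExtractor()
    extractor.feed(page)
    extractor.close()  # flush any text still buffered by the parser
    return extractor.text()


def genbank_handle(page):
    """Hypothetical helper: return a file-like handle that starts at the
    first line beginning with LOCUS, which is what the reader expects."""
    lines = strip_html(page).splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.startswith("LOCUS"):
            return io.StringIO("".join(lines[index:]))
    raise ValueError("no line starting with LOCUS found")
```

  The handle returned by genbank_handle could then be passed to the iterator in
place of the raw file handle, so the same text is produced whichever browser
saved the page.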

                                                                    Cayte