[Biopython-dev] Notification: incoming/43
biopython-bugs at bioperl.org
Thu Sep 27 05:22:02 EDT 2001
JitterBug notification
new message incoming/43
Message summary for PR#43
From: mkersz at pasteur.fr
Subject: GenBank parser fails (on large files?)
Date: Thu, 27 Sep 2001 05:22:01 -0400
0 replies 0 followups
====> ORIGINAL MESSAGE FOLLOWS <====
From mkersz at pasteur.fr Thu Sep 27 05:22:02 2001
Received: from localhost (localhost [127.0.0.1])
by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f8R9M1p18288
for <biopython-bugs at pw600a.bioperl.org>; Thu, 27 Sep 2001 05:22:01 -0400
Date: Thu, 27 Sep 2001 05:22:01 -0400
Message-Id: <200109270922.f8R9M1p18288 at pw600a.bioperl.org>
From: mkersz at pasteur.fr
To: biopython-bugs at bioperl.org
Subject: GenBank parser fails (on large files?)
Full_Name: Michel Kerszberg
Module: GenBank
Version: 1.00a3
OS: linux 2.2
Submission from: cache.pasteur.fr (157.99.64.13)
To reproduce, fetch
ftp://ncbi.nlm.nih.gov/genbank/genomes/Bacteria/Mycobacterium_tuberculosis_H37Rv/AL123456.gbk
and open it with:
from Bio import GenBank
file_handle = open( ... ,'r')
pars = GenBank.FeatureParser()
iter = GenBank.Iterator(file_handle, pars)
rec = iter.next()
This fails with:
    rec = iter.next()
  File "/usr/lib/python2.0/site-packages/Bio/GenBank/__init__.py", line 182, in next
    return self._parser.parse(File.StringHandle(data))
  File "/usr/lib/python2.0/site-packages/Bio/GenBank/__init__.py", line 260, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.0/site-packages/Bio/GenBank/__init__.py", line 1108, in feed
    self._parser.parseFile(handle)
  File "/usr/lib/python2.0/site-packages/Martel/Parser.py", line 205, in parseFile
    self.parseString(fileobj.read())
  File "/usr/lib/python2.0/site-packages/Martel/Parser.py", line 233, in parseString
    self._err_handler.fatalError(result)
  File "/var/tmp/python-root//usr/lib/python2.0/xml/sax/handler.py", line 38, in fatalError
Martel.Parser.ParserPositionException: error parsing at or beyond character 42
This is in the first line of the record, which seems
correctly formatted. No amount of massaging of the
file seems to help.
I have seen this problem reported with other large
GenBank records.
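
Since the exception reports only a character offset, it can help to look at the raw text around that offset. The following is a rough sketch in plain Python, not part of Biopython or Martel. The helper names `context_around` and `line_and_column` are made up for illustration, and the sample string is a placeholder, not the real first line of AL123456.gbk. Printing with `repr` makes otherwise invisible characters such as tabs or carriage returns visible.

```python
def context_around(text, offset, width=20):
    """Return the text just before and just after a character offset."""
    start = max(0, offset - width)
    end = min(len(text), offset + width)
    return text[start:offset], text[offset:end]


def line_and_column(text, offset):
    """Convert a 0-based character offset to a 1-based (line, column) pair."""
    line = text.count("\n", 0, offset) + 1
    # rfind returns -1 when the offset is on the first line,
    # which makes the column come out 1-based in both cases.
    last_newline = text.rfind("\n", 0, offset)
    column = offset - last_newline
    return line, column


if __name__ == "__main__":
    # Placeholder text only; replace with the contents of the failing file,
    # e.g. text = open(path_to_record).read() (path_to_record is hypothetical).
    sample = "LOCUS" + " " * 7 + "PLACEHOLDER" + " " * 30 + "\nDEFINITION  placeholder.\n"
    before, after = context_around(sample, 42)
    print(repr(before), repr(after))
    print(line_and_column(sample, 42))
```

Running this on the actual file contents with offset 42 would show exactly which characters the parser was looking at when it gave up, which may be worth attaching to the report.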