[Biopython-dev] EMBL flatfile parsing

Albert Krewinkel krewink at inb.uni-luebeck.de
Mon Apr 3 17:48:06 UTC 2006


I am trying to parse an EMBL-formatted file with Biopython, but I
couldn't find a working parser for it. When I try to use the
Martel-based parser as described in one of the mailing list threads, I
get the following error:

Python 2.4.1 (#1, Oct 22 2005, 16:20:11)
[GCC 4.0.0 20041026 (Apple Computer, Inc. build 4061)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> filename = '/Users/krewinkel/tmp/embltest.embl'
>>> from Bio.formatdefs.embl import embl65
>>> from xml.sax import saxutils
>>> parser = embl65.make_parser()
>>> parser.setContentHandler(saxutils.XMLGenerator())
>>> parser.parse(open(filename))
<?xml version="1.0" encoding="iso-8859-1"?>
<dataset format="embl/65">Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/opt/local/lib/python2.4/site-packages/Martel/Parser.py", line 482, in parse
    self.parseFile(source.getCharacterStream() or source.getByteStream())
  File "/opt/local/lib/python2.4/site-packages/Martel/Parser.py", line 468, in parseFile
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/xml/sax/handler.py", line 34, in error
    raise exception
Martel.Parser.ParserPositionException: error parsing at or beyond character 0

The file itself appears to be okay, since it can be read by 'seqret'
and bioperl. This seems to be a parser problem -- or am I doing
something wrong?
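
An error "at or beyond character 0" could mean that the very first
line of the file already fails to match the format definition, though
that is only one possible cause and is not confirmed here. A minimal
sanity check, assuming only the general EMBL flatfile convention that
an entry starts with an "ID" line and ends with a "//" line, might
look like the sketch below. The function name check_embl_lines and the
sample entry are hypothetical and are not part of Biopython or Martel.

```python
def check_embl_lines(lines):
    """Return (first_code, has_terminator) for an iterable of lines.

    first_code is the first two characters of the very first line
    (an EMBL entry is expected to start with 'ID'); has_terminator
    says whether any line starts with the '//' entry terminator.
    Hypothetical helper for illustration only.
    """
    first_code = None
    has_terminator = False
    for line in lines:
        text = line.rstrip("\r\n")
        if first_code is None:
            # Only the very first line counts, even if it is blank.
            first_code = text[:2]
        if text[:2] == "//":
            has_terminator = True
    return first_code, has_terminator


# Made-up miniature entry, used only to exercise the helper.
sample = [
    "ID   TEST0001   standard; DNA; UNC; 4 BP.\n",
    "XX\n",
    "SQ   Sequence 4 BP; 1 A; 1 C; 1 G; 1 T; 0 other;\n",
    "     acgt                                             4\n",
    "//\n",
]
print(check_embl_lines(sample))
```

On a real file the same check would be check_embl_lines(open(filename)).
A first code other than "ID" (for example an empty string from a
leading blank line) would at least be consistent with a failure right
at the start of the input.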

Thanks in advance

Albert Krewinkel
University of Luebeck
phone: +49 (451) 500 5516
email: krewink at inb.uni-luebeck.de
