[BioPython] GenBank Parser Errors (Repost)
Michael Maibaum
mike at maibaum.org
Mon Nov 8 05:28:29 EST 2004
Hi,
(I'm sorry if you get this twice. I sent it to the list last week and
didn't get a reply, so I'm hoping someone with a suggestion will see it
this time. Thanks.)
I'm trying to use Biopython to parse GenBank files. It works happily on
some files but fails on many others. So far the pattern appears to be:
Prokaryotic complete genome => OK
Eukaryotic complete genome => failure
The failures typically occur very early in the file, and the traceback
doesn't carry much useful information. It falls over in the Martel
parser with the error:
Martel.Parser.ParserPositionException: error parsing at or beyond
character 191
As this genome is a bit large to attach, I've just included the +/- 10
lines around character 191.
The full file, should you want it, is at:
<ftp://ftp.ensembl.org/pub/current_tetraodon/data/flatfiles/genbank/Tetraodon_nigroviridis.0.dat.gz>
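For anyone who wants to look at the same region of the file, the lines around a reported character position can be pulled out with the standard library alone. This is only a rough sketch: the function name context_around is made up for illustration, and it assumes the parser counts characters from zero and that the file is plain or gzip-compressed text. It does not use Biopython or Martel at all.

```python
import gzip


def context_around(path, offset, radius=10):
    """Return (line_number, text) pairs around the line holding character `offset`.

    `offset` is treated as a 0-based character position (an assumption about
    how the parser counts). Line numbers in the result are 1-based.
    """
    opener = gzip.open if path.endswith(".gz") else open
    lines = []
    seen = 0      # characters consumed so far
    hit = None    # 0-based index of the line containing `offset`
    # newline="" keeps the original line endings so the character count is not altered
    with opener(path, "rt", encoding="latin-1", newline="") as handle:
        for index, line in enumerate(handle):
            lines.append(line)
            seen += len(line)
            if hit is None and seen > offset:
                hit = index
            if hit is not None and index >= hit + radius:
                break  # enough trailing context has been read
    if hit is None:
        return []  # offset lies beyond the end of the file
    start = max(0, hit - radius)
    return [(n + 1, lines[n].rstrip("\r\n")) for n in range(start, len(lines))]


# Example call (requires the downloaded file to be present locally):
# for number, text in context_around("Tetraodon_nigroviridis.0.dat.gz", 191):
#     print(number, text)
```

Comparing that output for a file that parses with one that fails may show which header line the parser is choking on.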
Does anyone have any idea why this is failing? Is it just the joy of
tracking NCBI record formats, so that I need to start looking at the
internals for a fix (or use something else)?
Is it worth trying Biopython from CVS?
Mac OS X 10.3.5
Python 2.3.4, up to date biopython
thanks
Michael
--
Dr Michael Maibaum
Department of Biochemistry and Molecular Biology, UCL
email: maibaum at biochemistry.ucl.ac.uk