[BioPython] GenBank parsing errors

Michael Maibaum mike at maibaum.org
Tue Nov 2 12:56:38 EST 2004


Hi,

I'm trying to use Biopython to parse GenBank files. It works happily
on some GenBank files, but not on many others. So far the pattern
appears to be:

Prokaryotic complete genome => OK
Eukaryotic complete genome => failure

The failures typically occur very early in the file, and the traceback
doesn't contain much useful information. It falls over in the Martel
parser with the error:

Martel.Parser.ParserPositionException: error parsing at or beyond character 191

As this genome is a bit large to attach, I've just included the +/- 10
lines around character 191.
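
For anyone who wants to pull out the same region themselves, here is a minimal sketch that collects the lines surrounding a given character offset. It uses only the Python standard library. The function name `lines_around_offset` and its parameters are made up for illustration and are not part of Biopython or Martel. It also assumes the reported position is a 0-based character count from the start of the file, which may not match exactly how Martel counts.

```python
import gzip
from collections import deque


def lines_around_offset(path, offset, context=10):
    """Return (line_number, text) pairs around the line holding `offset`.

    `offset` is treated as a 0-based character position from the start of
    the file (an assumption; the parser may count differently). Plain and
    gzip-compressed files are both accepted.
    """
    opener = gzip.open if path.endswith(".gz") else open
    before = deque(maxlen=context)  # the last `context` lines seen so far
    result = []
    consumed = 0            # characters read up to the end of the current line
    remaining_after = None  # stays None until the target line is found

    with opener(path, "rt", encoding="ascii", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            text = line.rstrip("\n")
            if remaining_after is None:
                consumed += len(line)
                if offset < consumed:
                    # This line contains the offset: keep it plus the
                    # lines buffered before it.
                    result = list(before) + [(number, text)]
                    remaining_after = context
                else:
                    before.append((number, text))
            else:
                if remaining_after == 0:
                    break  # stop early; no need to read a whole genome
                result.append((number, text))
                remaining_after -= 1
    return result
```

Calling `lines_around_offset("Tetraodon_nigroviridis.0.dat.gz", 191)` and printing each pair would show the numbered lines on either side of the reported position. Because the file is read in text mode, Windows-style line endings are counted as one character, so the offset can drift slightly on such files.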

The full file, should you want it, is at:
<ftp://ftp.ensembl.org/pub/current_tetraodon/data/flatfiles/genbank/Tetraodon_nigroviridis.0.dat.gz>

Does anyone have any ideas why this is failing? Is it just the joy of
tracking NCBI record formats, meaning I need to start looking at the
internals for a fix (or use a different tool), or is something else
going on?

Thanks,

Michael

-- 
Dr Michael Maibaum
Department of Biochemistry and Molecular Biology, UCL
email: maibaum at biochemistry.ucl.ac.uk


