[BioPython] Genbank parsing problem and fix

Gemma Atkinson gca500 at york.ac.uk
Tue Oct 3 11:36:58 UTC 2006


Hi Peter,

I was using the Bio.GenBank module. This is the code I've been using:

from Bio import GenBank
parser = GenBank.RecordParser(debug_level=2)
record = parser.parse(open("test4.txt"))

It was the expressions/genbank.py file, imported from within the
GenBank module, that I've been changing. I haven't touched the
formatdefs/genbank.py file (I should have made that clear before, sorry).

This was the error I was getting before I changed
expressions/genbank.py:

  File "testgbparser.py", line 3, in ?
    record = parser.parse(open("test4.txt"))
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/Bio/GenBank/__init__.py", line 240, in parse
    self._scanner.feed(handle, self._consumer)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/Bio/GenBank/__init__.py", line 1259, in feed
    self._parser.parseFile(handle)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/Martel/Parser.py", line 328, in parseFile
    self.parseString(fileobj.read())
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/Martel/Parser.py", line 356, in parseString
    self._err_handler.fatalError(result)
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/xml/sax/handler.py", line 38, in fatalError
    raise exception
Martel.Parser.ParserPositionException: error parsing at or beyond character 1153
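
For anyone hitting the same exception: the message only gives a
character offset, so it can help to look at what the file contains
around that position. Below is a minimal sketch in plain Python 3 (no
Biopython needed). The helper name context_at and the sample text are
made up for illustration; they are not part of Biopython or Martel. Note
that the message says "at or beyond", so the offending line may sit a
little after the reported offset.

```python
def context_at(text, offset, width=60):
    """Return (line_number, snippet) for a character offset in text.

    Hypothetical helper for illustration; not a Biopython or Martel API.
    """
    # Clamp the offset so an out-of-range value cannot raise.
    offset = max(0, min(offset, len(text)))
    # Lines are numbered from 1; every newline before the offset adds one.
    line_number = text.count("\n", 0, offset) + 1
    start = max(0, offset - width)
    end = min(len(text), offset + width)
    return line_number, text[start:end]


# Made-up fragment, only loosely shaped like a GenBank record.
sample = (
    "LOCUS       TEST\n"
    "REFERENCE   1\n"
    "  AUTHORS   Smith,J.\n"
    "REFERENCE   2\n"
    "  TITLE     Direct Submission\n"
)

offset = sample.index("REFERENCE   2")
line_number, snippet = context_at(sample, offset, width=20)
print("offset", offset, "is on line", line_number)
print(repr(snippet))
```

With a real file, read its contents into a string first and pass the
offset from the exception message, for example
context_at(open("test4.txt").read(), 1153).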


Gemma

On 3 Oct 2006, at 10:54, Peter wrote:

> gca500 at york.ac.uk wrote:
>> Hi All,
>> Been having a problem using the GenBank RecordParser with some
>> GenBank files that have recently been added to NCBI. After a bit
>> of trial and error, I realised the problem only occurs if a
>> REFERENCE field isn't followed by an AUTHORS field (for example in
>> reference 2 of this record:
>> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&val=88602864).
>> There's a very easy fix on line 289 of Genbank.py. Decided to post
>> this to the list to save anyone else who stumbles across this
>> problem tearing their hair out like I've been doing this afternoon!
>> Change ... and it works!
>> Hope this is useful,
>> Gemma
>
> Hi Gemma,
>
> I have made your suggested change to biopython/Bio/formatdefs/genbank.py
> as CVS revision 1.10, which should be viewable online soon:
>
> http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/expressions/genbank.py?cvsroot=biopython
>
> I am curious as to why you are using this code (part of the  
> FormatIO system), rather than the Bio.GenBank module.
>
> Thank you,
>
> Peter
>



