[BioPython] Genbank parsing problem and fix
Gemma Atkinson
gca500 at york.ac.uk
Tue Oct 3 11:36:58 UTC 2006
Hi Peter,
I was using the Bio.GenBank module. This is the code I've been using:

    from Bio import GenBank
    parser = GenBank.RecordParser(debug_level=2)
    record = parser.parse(open("test4.txt"))

It was the expressions/genbank.py file, which is imported from within the
GenBank module, that I've been changing. I haven't touched the
formatdefs/genbank.py file (I should have made that clear before, sorry).
This was the error I was getting before I changed expressions/genbank.py:

  File "testgbparser.py", line 3, in ?
    record = parser.parse(open("test4.txt"))
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/Bio/GenBank/__init__.py", line 240, in parse
    self._scanner.feed(handle, self._consumer)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/Bio/GenBank/__init__.py", line 1259, in feed
    self._parser.parseFile(handle)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/Martel/Parser.py", line 328, in parseFile
    self.parseString(fileobj.read())
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/Martel/Parser.py", line 356, in parseString
    self._err_handler.fatalError(result)
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/xml/sax/handler.py", line 38, in fatalError
    raise exception
Martel.Parser.ParserPositionException: error parsing at or beyond character 1153

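For anyone hitting a similar error: the exception only reports a character position, so it can help to translate that position into a line of the input file. The following is a minimal sketch using only the standard library. The helper name locate_offset is made up for illustration, and it assumes the reported position is a 0-based offset into the text as read from the file, which I have not confirmed for Martel. Since the message says "at or beyond", treat the result as a starting point rather than the exact failing line.

```python
def locate_offset(text, offset):
    """Map a character offset to (line_number, column, line_text).

    Assumption: `offset` is 0-based. Line and column numbers are 1-based.
    """
    # Clamp so an offset past the end of the text still gives a usable answer.
    offset = max(0, min(offset, len(text)))
    # Number of newlines before the offset tells us which line we are on.
    line_number = text.count("\n", 0, offset) + 1
    # Start of the current line is just after the previous newline (or 0).
    line_start = text.rfind("\n", 0, offset) + 1
    line_end = text.find("\n", offset)
    if line_end == -1:
        line_end = len(text)
    column = offset - line_start + 1
    return line_number, column, text[line_start:line_end]
```

With the file from this message, it could be called as locate_offset(open("test4.txt").read(), 1153) and the returned line compared against the record by eye.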
Gemma
On 3 Oct 2006, at 10:54, Peter wrote:
> gca500 at york.ac.uk wrote:
>> Hi All,
>> I've been having a problem using the GenBank RecordParser with some
>> GenBank files that have recently been added to NCBI. After a bit
>> of trial and error, I realised the problem only occurs if a
>> REFERENCE field isn't followed by an AUTHORS field (for example in
>> reference 2 of this record:
>> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&val=88602864).
>> There's a very easy fix on line 289 of Genbank.py. I decided to post
>> this to the list to save anyone else who stumbles across this
>> problem from tearing their hair out like I've been doing this afternoon!
>> Change ... and it works!
>> Hope this is useful,
>> Gemma
>
> Hi Gemma,
>
> I have made your suggested change to biopython/Bio/formatdefs/genbank.py
> as CVS revision 1.10, which should be viewable online soon:
>
> http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/expressions/genbank.py?cvsroot=biopython
>
> I am curious as to why you are using this code (part of the
> FormatIO system), rather than the Bio.GenBank module.
>
> Thank you,
>
> Peter
>
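
The condition described in the quoted message (a REFERENCE block with no AUTHORS line) can be checked for before parsing. Below is a rough sketch that does not use Biopython at all; the function name references_without_authors is hypothetical. It assumes the usual GenBank flat-file layout, where top-level keywords such as REFERENCE start in the first column and sub-keywords such as AUTHORS sit within the first 12 columns of an indented line.

```python
def references_without_authors(lines):
    """Return the header line of every REFERENCE block that has no AUTHORS line."""
    missing = []
    current = None        # header of the REFERENCE block being scanned, if any
    has_authors = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line and not line[0].isspace():
            # Any top-level keyword (or the "//" terminator) closes an open block.
            if current is not None and not has_authors:
                missing.append(current)
            current = line if line.startswith("REFERENCE") else None
            has_authors = False
        elif current is not None and line[:12].strip() == "AUTHORS":
            # Sub-keyword found in the keyword columns of an indented line.
            has_authors = True
    # Handle a file that ends while a REFERENCE block is still open.
    if current is not None and not has_authors:
        missing.append(current)
    return missing
```

Running it over open("test4.txt") would list the REFERENCE headers that are likely to trigger the parsing error on an unpatched installation.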