[BioPython] GenBank records again

JINLING HUANG jinling at cs.uga.edu
Wed Feb 26 22:21:48 EST 2003


Jeff and all,

Thank you very much for the help with GenBank records. Now I am trying to
retrieve protein sequences using a file of GenBank ids (gi numbers). My script
is as follows:

from Bio import GenBank
import sys

file = sys.argv[1]
fp1 = open(file, 'r+')    #file of gi
ids = fp1.read()

lids = ids.split()
recNum = len(lids)

protein_ncbi_dict = GenBank.NCBIDictionary(database='protein',
                        format='gp', parser=GenBank.FeatureParser())

for i in range(0, recNum):
    gb_record = protein_ncbi_dict[lids[i]]
    print '>'+ gb_record.id[0:-2] + '   ' + gb_record.seq.data

The script works well most of the time, but sometimes it gives an error
message:

Traceback (most recent call last):
  File "getGBRecords.py", line 25, in ?
    gb_record = protein_ncbi_dict[lids[i]]
  File "/bio/python2.2/lib/python2.2/site-packages/Bio/GenBank/__init__.py", line 1563, in __getitem__
    return self.parser.parse(handle)
  File "/bio/python2.2/lib/python2.2/site-packages/Bio/GenBank/__init__.py", line 268, in parse
    self._scanner.feed(handle, self._consumer)
  File "/bio/python2.2/lib/python2.2/site-packages/Bio/GenBank/__init__.py", line 1255, in feed
    self._parser.parseFile(handle)
  File "/bio/python2.2/lib/python2.2/site-packages/Martel/Parser.py", line 338, in parseFile
    self.parseString(fileobj.read())
  File "/bio/python2.2/lib/python2.2/site-packages/Martel/Parser.py", line 366, in parseString
    self._err_handler.fatalError(result)
  File "/bio/python2.2/lib/python2.2/xml/sax/handler.py", line 38, in fatalError
    raise exception
Martel.Parser.ParserPositionException: error parsing at or beyond character 14


What causes this error? It seems to come from the parser, but I don't know
why. Can anybody help?
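
To find out which ids trigger the error, one option would be to wrap each
lookup in try/except and collect the failures instead of stopping at the first
one. A minimal sketch of that idea follows, written for current Python 3 rather
than the Python 2.2 used above. The names fetch_all and lookup are hypothetical,
and the small dictionary is only a stand-in for protein_ncbi_dict, not the real
NCBI lookup.

```python
def fetch_all(ids, lookup):
    """Call lookup(gi) for every id and keep going when one fails.

    Returns two dicts: successful records keyed by id, and the
    exception raised for each id that could not be retrieved.
    """
    records = {}
    failures = {}
    for gi in ids:
        try:
            records[gi] = lookup(gi)
        except Exception as exc:
            # A parser error raised inside the lookup would land here.
            failures[gi] = exc
    return records, failures


# Stand-in for protein_ncbi_dict (assumption: any object that is
# indexed by id string and raises an exception on a bad record).
fake_dict = {"111": "record for 111", "333": "record for 333"}

# Same whitespace splitting as ids.split() in the script above.
ids = "111 222\n333\n".split()

records, failures = fetch_all(ids, fake_dict.__getitem__)
for gi in failures:
    print("failed:", gi, repr(failures[gi]))
```

Running the failing ids one at a time afterwards should show whether the error
always comes from the same records or only happens now and then.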

Best wishes,

Jinling


