[Biopython-dev] [Biopython - Bug #3197] SeqIO parse error with some genbank files
redmine at redmine.open-bio.org
Mon Jul 11 09:44:25 UTC 2011
Issue #3197 has been updated by Peter Cock.
Two more examples from the NCBI Bacteria FTP site, reported by Tim te Beek on our mailing list:
http://lists.open-bio.org/pipermail/biopython-dev/2011-July/009018.html
ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Desulfurococcus_kamchatkensis_1221n_uid59133/NC_011766.gbk
LOCUS NC_011766 1365223 bp DNA circular BCT 20-MAY-2011
DEFINITION Desulfurococcus kamchatkensis 1221n chromosome, complete genome.
ACCESSION NC_011766
VERSION NC_011766.1 GI:218883314
DBLINK Project: 59133
KEYWORDS .
SOURCE Desulfurococcus kamchatkensis 1221n
...
     misc_feature    complement(order(1078481..1078483,join(1078778,
                     1078800..1078810)))
                     /locus_tag="DKAM_1147"
                     /note="active site"
                     /db_xref="CDD:73252"
http://lists.open-bio.org/pipermail/biopython-dev/2011-July/009019.html
ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Saccharopolyspora_erythraea_NRRL_2338_uid62947/NC_009142.gbk
LOCUS NC_009142 8212805 bp DNA circular BCT 14-FEB-2011
DEFINITION Saccharopolyspora erythraea NRRL 2338 chromosome, complete genome.
ACCESSION NC_009142
VERSION NC_009142.1 GI:134096620
DBLINK Project: 62947
KEYWORDS complete genome.
SOURCE Saccharopolyspora erythraea NRRL 2338
...
     misc_feature    order(2409324..2409326,2409399..2409401,2409528..2409533,
                     2409619..2409624,2409679..2409681,2409748..2409753,
                     2409754..2409759,2409835..2409837,join(2409886..2409890,
                     2409892..2409898),2409911..2409913,2409920..2409925)
                     /locus_tag="SACE_2218"
                     /note="active site"
                     /db_xref="CDD:119408"
     misc_feature    order(2409324..2409326,2409399..2409401,2409528..2409530)
                     /locus_tag="SACE_2218"
                     /note="catalytic tetrad; other site"
                     /db_xref="CDD:119408"
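
For anyone triaging similar files: the "active site" locations quoted above both appear to contain a join(...) nested inside an order(...), while the second SACE_2218 feature is a plain order(...) with no nesting. The sketch below flags that pattern in a location string. It is a standalone illustration and not part of Biopython; the function name has_nested_list_operator is made up for this example, and treating "join inside order" as the trigger is an assumption drawn from these examples only.

```python
def has_nested_list_operator(location):
    """Return True if a join/order appears inside another join/order.

    Hypothetical helper for illustration; not a Biopython function.
    """
    # Wrapped feature lines carry line breaks and padding; drop them first.
    text = "".join(location.split())
    list_ops = ("join", "order")
    stack = []  # operators whose parentheses are currently open
    word = ""   # run of letters seen immediately before the next "("
    for ch in text:
        if ch == "(":
            if word in list_ops and any(op in list_ops for op in stack):
                return True
            stack.append(word)
            word = ""
        elif ch == ")":
            if stack:
                stack.pop()
            word = ""
        elif ch.isalpha():
            word += ch
        else:
            # Digits, dots and commas end any operator name in progress.
            word = ""
    return False


dkam = """complement(order(1078481..1078483,join(1078778,
1078800..1078810)))"""
plain = "order(2409324..2409326,2409399..2409401,2409528..2409530)"

print(has_nested_list_operator(dkam))   # nested join inside order
print(has_nested_list_operator(plain))  # flat order
```

A complement(...) wrapped around a single join(...) is deliberately not flagged, since only join and order are counted as list operators here.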
----------------------------------------
Bug #3197: SeqIO parse error with some genbank files
https://redmine.open-bio.org/issues/3197
Author: Cedar McKay
Status: Resolved
Priority: Normal
Assignee: Biopython Dev Mailing List
Category: Main Distribution
Target version: 1.56
URL:
I've found a file that seems to choke SeqIO's GenBank parser. I downloaded it straight from NCBI, so it should be a valid file, and I've found a couple of other files that fail the same way. I reproduced the bug on another machine, also running Biopython 1.56, while other GenBank files parse successfully. Maybe it has something to do with that very long location? Please let me know if I can provide any other information!
Thanks!
Cedar
>>> from Bio import SeqIO
>>> record = SeqIO.read('./Acorus_americanus_NC_010093.gb', 'genbank')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Bio/SeqIO/__init__.py", line 597, in read
    first = iterator.next()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Bio/SeqIO/__init__.py", line 525, in parse
    for r in i:
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Bio/GenBank/Scanner.py", line 437, in parse_records
    record = self.parse(handle, do_features)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Bio/GenBank/Scanner.py", line 420, in parse
    if self.feed(handle, consumer, do_features):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Bio/GenBank/Scanner.py", line 392, in feed
    self._feed_feature_table(consumer, self.parse_features(skip=False))
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Bio/GenBank/Scanner.py", line 344, in _feed_feature_table
    consumer.location(location_string)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/Bio/GenBank/__init__.py", line 975, in location
    raise LocationParserError(location_line)
Bio.GenBank.LocationParserError: order(join(42724..42726,43455..43457),43464..43469,43476..43481,43557..43562,43569..43574,43578..43583,43677..43682,44434..44439)
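
The location in the error message is an order(...) whose first element is itself a join(...). As a rough illustration of how such a nested string can be handled recursively, here is a minimal sketch. It is not the code in Bio/GenBank, and split_top_level and parse_location are hypothetical names. It assumes plain integer positions (no fuzzy "<" or ">" markers, no references to other accessions) and keeps GenBank's 1-based inclusive coordinates unchanged.

```python
def split_top_level(text):
    """Split on commas that are not enclosed in parentheses."""
    parts = []
    depth = 0
    start = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts


def parse_location(text):
    """Parse a location string into nested (operator, [children]) tuples.

    Leaves are (start, end) integer pairs; a single base N becomes (N, N).
    Illustrative only: fuzzy positions raise ValueError.
    """
    text = "".join(text.split())  # remove whitespace from wrapped lines
    for op in ("complement", "join", "order"):
        prefix = op + "("
        if text.startswith(prefix) and text.endswith(")"):
            inner = text[len(prefix):-1]
            return (op, [parse_location(p) for p in split_top_level(inner)])
    if ".." in text:
        start, end = text.split("..")
        return (int(start), int(end))
    return (int(text), int(text))


failing = (
    "order(join(42724..42726,43455..43457),43464..43469,43476..43481,"
    "43557..43562,43569..43574,43578..43583,43677..43682,44434..44439)"
)
tree = parse_location(failing)
print(tree[0])     # outer operator
print(tree[1][0])  # first child: the nested join
```

Splitting only on commas at parenthesis depth zero is what lets the inner join(...) survive as a single element of the outer order(...).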