[BioPython] GenBank parser used to break recently on rRNA records
Martin MOKREJŠ
mmokrejs at ribosome.natur.cuni.cz
Fri Jul 27 13:42:56 UTC 2007
Hi,
I tried to parse all ESTs and cDNAs from GenBank using a Biopython CVS
checkout that is about 3 weeks old, and it choked here:
Will parse file 'ftp://ftp.ncbi.nlm.nih.gov/genbank/gbhtc12.seq.gz'
Traceback (most recent call last):
  File "translate_ESTs.py", line 27, in ?
    _record = _iterator.next()
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 142, in next
    return self._parser.parse(self.handle)
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 208, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", line 360, in feed
    self._feed_first_line(consumer, self.line)
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", line 820, in _feed_first_line
    assert line[47:54].strip() in ['','DNA','RNA','tRNA','mRNA','uRNA','snRNA','cDNA'], \
AssertionError: LOCUS line does not contain valid sequence type (DNA, RNA, ...):
LOCUS DQ369798 725 bp rRNA linear HTC 14-JUN-2007
However, I see that the code has been revamped in current CVS, so this is
just for your information. I can parse the file with the current code. ;-)
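
For anyone stuck on an older checkout, here is a minimal sketch of why the
assertion fires on this record. It is not the Biopython code: the helper
molecule_type() is hypothetical, and it splits the LOCUS line on whitespace
instead of reading fixed columns as the scanner does, so it assumes the
molecule type is simply the token right after "bp". The list of accepted
types is copied from the assert statement in the traceback.

```python
# Molecule types accepted by the assertion shown in the traceback.
OLD_ALLOWED = ['', 'DNA', 'RNA', 'tRNA', 'mRNA', 'uRNA', 'snRNA', 'cDNA']


def molecule_type(locus_line):
    """Return the token that follows 'bp' in a LOCUS line.

    Hypothetical helper for illustration only: it splits on whitespace,
    which is a simplification of the fixed-column slice used by the scanner.
    """
    fields = locus_line.split()
    return fields[fields.index('bp') + 1]


# The LOCUS line reported in the AssertionError message.
line = "LOCUS DQ369798 725 bp rRNA linear HTC 14-JUN-2007"
mol = molecule_type(line)

# 'rRNA' is not in the old list, which is what triggered the failure.
print(mol, mol in OLD_ALLOWED)
```

The point is only that 'rRNA' was missing from the hard-coded list, so any
record with that molecule type was rejected before the rest of it was read.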
Martin