[BioPython] GenBank parser used to break recently on rRNA records

Martin MOKREJŠ mmokrejs at ribosome.natur.cuni.cz
Fri Jul 27 13:42:56 UTC 2007


Hi,
  I tried to parse all ESTs and cDNAs from GenBank using a Biopython CVS
checkout about 3 weeks old, and it choked here:

Will parse file 'ftp://ftp.ncbi.nlm.nih.gov/genbank/gbhtc12.seq.gz'
Traceback (most recent call last):
  File "translate_ESTs.py", line 27, in ?
    _record = _iterator.next()
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 142, in next
    return self._parser.parse(self.handle)
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 208, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", line 360, in feed
    self._feed_first_line(consumer, self.line)
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", line 820, in _feed_first_line
    assert line[47:54].strip() in ['','DNA','RNA','tRNA','mRNA','uRNA','snRNA','cDNA'], \
AssertionError: LOCUS line does not contain valid sequence type (DNA, RNA, ...):
LOCUS       DQ369798                 725 bp    rRNA    linear   HTC 14-JUN-2007
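
  To make the failure concrete, here is a minimal standalone sketch of the
check from the traceback. It does not use Biopython; the variable names
(locus_line, accepted, seq_type) are mine and only for illustration. The
slice and the list of accepted types are copied from the assert statement
shown above.

```python
# LOCUS line copied verbatim from the error message above.
locus_line = "LOCUS       DQ369798                 725 bp    rRNA    linear   HTC 14-JUN-2007"

# Sequence types accepted by the assertion shown in the traceback.
accepted = ['', 'DNA', 'RNA', 'tRNA', 'mRNA', 'uRNA', 'snRNA', 'cDNA']

# Same slice the scanner applies to the LOCUS line.
seq_type = locus_line[47:54].strip()

# Show the extracted type and whether the old list would have accepted it.
print(repr(seq_type), seq_type in accepted)
```

  The slice picks up the molecule type field, and 'rRNA' is simply missing
from the list, so the assertion fires on this record.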

  However, I see the code has been revamped in current CVS, so this is
just for your information. I can parse the file with the current code. ;-)
Martin


