[BioPython] BLAST parser error with local web blast of multiple queries

Mike Cariaso MCariaso at Endogenybio.com
Tue Jun 17 11:51:43 EDT 2003


The BLAST output that seems to choke the parser is attached.

The error message is:
Traceback (most recent call last):
  File "./blastscores.py", line 13, in ?
    b_record = b_iterator.next()
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 367, in next
    return self._parser.parse(File.StringHandle(data))
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 47, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 98, in feed
    self._scan_header(uhandle, consumer)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 148, in _scan_header
    read_and_call_until(uhandle, consumer.reference, start='<p>')
  File "/usr/lib/python2.2/site-packages/Bio/ParserSupport.py", line 366, in read_and_call_until
    line = safe_readline(uhandle)
  File "/usr/lib/python2.2/site-packages/Bio/ParserSupport.py", line 442, in safe_readline
    raise SyntaxError, "Unexpected end of stream."
SyntaxError: Unexpected end of stream.
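
Reading the traceback from the bottom up: _scan_header asks read_and_call_until to consume lines until one starts with '<p>', and safe_readline raises because the handle runs out of data first. The sketch below mimics that behaviour with simplified stand-ins. The helper names readline_or_raise and read_until_start are hypothetical, they are not Biopython's implementations, and ValueError is used in place of the SyntaxError seen above. It only illustrates how a chunk of text with no '<p>' line could lead to this kind of error.

```python
import io


def readline_or_raise(handle):
    """Hypothetical stand-in: return the next line, or raise at end of data."""
    line = handle.readline()
    if not line:
        raise ValueError("Unexpected end of stream.")
    return line


def read_until_start(handle, callback, start):
    """Hypothetical stand-in: pass lines to callback until one starts with `start`.

    Returns the number of lines handed to the callback. If no line ever
    starts with `start`, readline_or_raise raises when the data runs out.
    """
    count = 0
    while True:
        line = readline_or_raise(handle)
        if line.startswith(start):
            break
        callback(line)
        count += 1
    return count


seen = []

# A header that does contain the expected '<p>' line: the loop stops there.
complete = io.StringIO(u"Reference: some citation text\n<p>\nrest of report\n")
n_complete = read_until_start(complete, seen.append, u"<p>")

# A chunk with no '<p>' line at all: the loop runs off the end of the data.
truncated = io.StringIO(u"Reference: some citation text\nno paragraph tag here\n")
try:
    read_until_start(truncated, seen.append, u"<p>")
    message = None
except ValueError as err:
    message = str(err)
```

If this reading is right, the chunk handed to the parser either lacks the '<p>' line the header scanner expects or ends before reaching it.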


The attached HTMLized BLAST output was produced by NCBI's wwwblast,
available from ftp://ftp.ncbi.nih.gov/blast/server/current_release


My problem seems to fall in the gaps between several of the tutorial
examples, so the fault may lie in my code or in the BLAST parser.


Here is a trimmed-down example:
#!/usr/bin/env python
# Print title, length and E-value for every HSP in a multi-query BLAST report.

import sys
from Bio.Blast import NCBIWWW
from Bio.Blast import NCBIStandalone

if __name__ == '__main__':
    blast_results = open(sys.argv[1])
    # Parser for HTML-formatted output, driven by the standalone iterator.
    b_parser = NCBIWWW.BlastParser()
    b_iterator = NCBIStandalone.Iterator(blast_results, b_parser)
    while 1:
        b_record = b_iterator.next()
        if b_record is None: break

        for alignment in b_record.alignments:
            for hsp in alignment.hsps:
                # join() only accepts strings, so convert each field first.
                print '\t'.join([alignment.title,
                                 str(alignment.length),
                                 str(hsp.expect)
                                 ])


My thinking has been along these lines:

 - My BLAST output has been HTMLized by NCBI's tool, so I think I need
NCBIWWW's parser.

 - There are multiple sequences in my FASTA input, so I need an
iterator.

 - NCBIWWW doesn't seem to provide an iterator, so I'm hoping to use
NCBIStandalone's iterator. This assumption is suspect, but I don't yet
know the Biopython code base well enough to know a better alternative.
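
One way to sidestep the iterator question would be to split the file into one chunk per query before parsing, and hand each chunk to the WWW parser on its own. The following is a minimal sketch of that idea, assuming every report in the file begins at a line containing a marker such as '<b>BLASTN'. That marker is a guess and should be checked against the actual wwwblast output. The function split_reports and the sample text are made up for illustration and are not part of Biopython.

```python
def split_reports(text, marker):
    """Split concatenated BLAST reports into one string per report.

    A new report is assumed to start at every line containing `marker`.
    Text before the first marker (for example an HTML preamble) stays with
    the first report. If the marker never appears, an empty list is returned.
    """
    reports = []
    current = []
    seen_marker = False
    for line in text.splitlines(True):  # True keeps the line endings
        if marker in line:
            if seen_marker:
                # A second or later marker closes the previous report.
                reports.append(''.join(current))
                current = []
            seen_marker = True
        current.append(line)
    if seen_marker:
        reports.append(''.join(current))
    return reports


# Made-up sample standing in for a two-query HTML report.
sample = (
    "<HTML>\n"
    "<b>BLASTN x.y.z</b>\n"
    "Query= seq1\n"
    "<b>BLASTN x.y.z</b>\n"
    "Query= seq2\n"
)
chunks = split_reports(sample, "<b>BLASTN")
```

Each chunk could then be wrapped in a file-like handle and passed to the parser's parse() method, which is what the traceback shows the iterator doing with File.StringHandle(data).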

Any help is greatly appreciated.

Michael Cariaso


-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://portal.open-bio.org/pipermail/biopython/attachments/20030617/dfd5f49d/tyrkin2-0001.html

