[BioPython] BLAST parser error with local web blast of multiple queries
Mike Cariaso
MCariaso at Endogenybio.com
Tue Jun 17 11:51:43 EDT 2003
The BLAST output that seems to choke the parser is attached.
The error message is:
Traceback (most recent call last):
  File "./blastscores.py", line 13, in ?
    b_record = b_iterator.next()
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 367, in next
    return self._parser.parse(File.StringHandle(data))
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 47, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 98, in feed
    self._scan_header(uhandle, consumer)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 148, in _scan_header
    read_and_call_until(uhandle, consumer.reference, start='<p>')
  File "/usr/lib/python2.2/site-packages/Bio/ParserSupport.py", line 366, in read_and_call_until
    line = safe_readline(uhandle)
  File "/usr/lib/python2.2/site-packages/Bio/ParserSupport.py", line 442, in safe_readline
    raise SyntaxError, "Unexpected end of stream."
SyntaxError: Unexpected end of stream.
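Reading the traceback from the bottom up, the header scanner was consuming lines until one starting with '<p>', and the input ran out first. A minimal sketch of that failure mode follows. Here `read_until` and `try_scan` are simplified stand-ins written for illustration, not Biopython's actual `read_and_call_until`, and the sample inputs are made up.

```python
import io


def read_until(handle, callback, start):
    """Pass lines to callback until one begins with `start`.

    Simplified, hypothetical stand-in for the behaviour suggested by the
    traceback; it is not Biopython's implementation.
    """
    while True:
        line = handle.readline()
        if not line:
            # Input ran out before the expected marker appeared.
            raise SyntaxError("Unexpected end of stream.")
        if line.startswith(start):
            return line
        callback(line)


def try_scan(text, start="<p>"):
    """Return (lines_seen, error_message_or_None) for a block of text."""
    seen = []
    try:
        read_until(io.StringIO(text), seen.append, start)
    except SyntaxError as exc:
        return seen, str(exc)
    return seen, None
```

If the chunk handed to the parser never contains a line beginning with the marker, the loop ends with the same message as in the traceback.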
The attached HTMLized BLAST output was produced by NCBI's wwwblast,
available from ftp://ftp.ncbi.nih.gov/blast/server/current_release
My problem seems to fall in the gaps between several of the tutorial
examples, so the fault may lie in my code or in the BLAST parser.
Here is a trimmed-down example:
#!/usr/bin/env python
import sys

from Bio.Blast import NCBIWWW
from Bio.Blast import NCBIStandalone

if __name__ == '__main__':
    blast_results = open(sys.argv[1])
    b_parser = NCBIWWW.BlastParser()
    b_iterator = NCBIStandalone.Iterator(blast_results, b_parser)
    while 1:
        # The loop stops once next() returns None.
        b_record = b_iterator.next()
        if b_record is None:
            break
        for alignment in b_record.alignments:
            for hsp in alignment.hsps:
                # join() accepts only strings, so str() guards the
                # fields that may not be strings.
                print '\t'.join([alignment.title,
                                 str(alignment.length),
                                 str(hsp.expect)])
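One caveat about the output line, independent of the parser error: `str.join` raises a TypeError if any item is not a string, so fields such as the length and the expect value need converting if they are numbers (I am assuming they may be). A minimal sketch of building the tab-separated row; `format_row` is a hypothetical helper, and the sample values in the usage note are made up rather than taken from the attached output.

```python
def format_row(title, length, expect):
    """Build one tab-separated output line, converting every field to text."""
    return "\t".join(str(field) for field in (title, length, expect))
```

For example, `format_row("gi|1|example", 120, 0.001)` gives the three fields separated by tabs, whatever their original types.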
My thinking has been along these lines:
- My BLAST output has been HTMLized by NCBI's tool, so I think I need
NCBIWWW's parser.
- There are multiple sequences in my FASTA input, so I need an
iterator.
- NCBIWWW doesn't seem to provide an iterator, so I'm hoping to use
NCBIStandalone's iterator. This assumption is suspect, but I don't yet
know the Biopython code base well enough to know a better alternative.
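One way to get an iterator without relying on NCBIStandalone might be to cut the file into one chunk per query and hand each chunk to the parser separately. The sketch below assumes every report begins with a line that a marker can recognise. The `<HTML>` marker and the `iter_reports` name are placeholders of mine, not taken from wwwblast output or from Biopython, so the real marker would need to be checked against the attached file.

```python
def iter_reports(lines, marker="<HTML>"):
    """Yield one string per report, splitting where a line starts with marker.

    `marker` is a placeholder; the real start-of-report line must be
    checked against actual output.
    """
    chunk = []
    for line in lines:
        # A new marker line closes the chunk collected so far.
        if line.startswith(marker) and chunk:
            yield "".join(chunk)
            chunk = []
        chunk.append(line)
    if chunk:
        yield "".join(chunk)
```

Each chunk could then be wrapped in a file-like object and passed to the parser's parse() method, which the traceback shows being called with a handle.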
Any help is greatly appreciated.
Michael Cariaso
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://portal.open-bio.org/pipermail/biopython/attachments/20030617/dfd5f49d/tyrkin2-0001.html