[BioPython] BLAST parser error with local web blast of multiple queries

Jeffrey Chang jchang at jeffchang.com
Tue Jun 17 14:16:18 EDT 2003


Hi Mike,

Thanks for the BLAST report.  Yes, there are indeed changes in the WWW 
format.  I've updated the NCBIWWW parser to deal with them.  
Unfortunately, there is no iterator for NCBIWWW output, and it's not 
trivial to create one.

In general, though, the NCBIStandalone parser (which parses plain text 
output) is more heavily used and better tested.  I'd strongly recommend 
using the plain text format (choose Plain text in the web form).  We will 
gradually deprecate support for HTML-ized BLAST reports in favor of 
this format.
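
To make the iterator idea concrete for plain text output: a file holding 
several concatenated reports can be cut into one chunk per query before 
parsing.  Below is a minimal sketch of that step, not Biopython's actual 
implementation.  The names split_reports and PROGRAMS and the sample text 
are made up for illustration, and the sketch assumes every report starts 
with a line beginning with the program name (BLASTP, BLASTN, ...).

```python
import io

# Assumption: each report opens with a header line that starts with
# one of these program names.
PROGRAMS = ("BLASTP", "BLASTN", "BLASTX", "TBLASTN", "TBLASTX")


def split_reports(handle):
    """Yield one report (as a string) per header line found in handle."""
    chunk = []
    for line in handle:
        # A new header closes the previous report, if one was collected.
        if line.startswith(PROGRAMS) and "".join(chunk).strip():
            yield "".join(chunk)
            chunk = []
        chunk.append(line)
    # Emit whatever is left after the last header.
    if "".join(chunk).strip():
        yield "".join(chunk)


# Hand-made sample standing in for a multi-query plain text report.
sample = (
    "BLASTP 2.2.6 [Apr-09-2003]\n"
    "Query= seq1\n"
    "\n"
    "BLASTP 2.2.6 [Apr-09-2003]\n"
    "Query= seq2\n"
)

reports = list(split_reports(io.StringIO(sample)))
```

Each chunk could then be handed to a parser on its own.  Real output may 
need a stricter header test than a simple prefix match.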

Along the same lines, is anyone using the XML format?  There is no 
support for it in biopython, but perhaps there should be.
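
In the meantime, XML output can be read with a general-purpose XML 
library.  The following is only a sketch using Python's 
xml.etree.ElementTree.  It assumes the report uses the element names of 
NCBI's BlastOutput XML (Hit, Hit_def, Hit_len, Hsp, Hsp_evalue).  The 
sample document is hand-made and heavily trimmed, and hsp_rows is a 
hypothetical helper, not part of any library.

```python
import xml.etree.ElementTree as ET

# Hand-made, heavily trimmed sample; a real report has many more fields.
SAMPLE_XML = """<?xml version="1.0"?>
<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_def>example protein (made-up)</Hit_def>
          <Hit_len>120</Hit_len>
          <Hit_hsps>
            <Hsp>
              <Hsp_evalue>1e-20</Hsp_evalue>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def hsp_rows(xml_text):
    """Yield (title, length, e-value) for every HSP in the report."""
    root = ET.fromstring(xml_text)
    for hit in root.iter("Hit"):
        title = hit.findtext("Hit_def")
        length = int(hit.findtext("Hit_len"))
        for hsp in hit.iter("Hsp"):
            yield title, length, float(hsp.findtext("Hsp_evalue"))


rows = list(hsp_rows(SAMPLE_XML))
```

The three values per row are the same ones the script quoted below tries 
to print (title, length, expect).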

Jeff




On Tuesday, June 17, 2003, at 07:51  AM, Mike Cariaso wrote:

> Blast output that seems to choke the parser is attached.
>
> Error message is:
> Traceback (most recent call last):
>   File "./blastscores.py", line 13, in ?
>     b_record = b_iterator.next()
>   File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 367, in next
>     return self._parser.parse(File.StringHandle(data))
>   File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 47, in parse
>     self._scanner.feed(handle, self._consumer)
>   File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 98, in feed
>     self._scan_header(uhandle, consumer)
>   File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 148, in _scan_header
>     read_and_call_until(uhandle, consumer.reference, start='<p>')
>   File "/usr/lib/python2.2/site-packages/Bio/ParserSupport.py", line 366, in read_and_call_until
>     line = safe_readline(uhandle)
>   File "/usr/lib/python2.2/site-packages/Bio/ParserSupport.py", line 442, in safe_readline
>     raise SyntaxError, "Unexpected end of stream."
> SyntaxError: Unexpected end of stream.
>
>
> The attached HTMLized blast output was produced by NCBI's wwwblast
> available from ftp://ftp.ncbi.nih.gov/blast/server/current_release
>
>
> My problem seems to fit in the gaps between several of the tutorial
> examples, so this may be a problem with my code, or the blast parser.
>
>
> Here is example trimmed down code:
> #!/usr/bin/env python
>
> import sys
> from Bio.Blast import NCBIWWW
> from Bio.Blast import NCBIStandalone
>
> if __name__ == '__main__':
>     blast_results = open(sys.argv[1])
>     b_parser = NCBIWWW.BlastParser()
>     b_iterator = NCBIStandalone.Iterator(blast_results, b_parser)
>     while 1:
>         b_record = b_iterator.next()
>         if b_record is None: break
>
>         for alignment in b_record.alignments:
>             for hsp in alignment.hsps:
>                 # join() only accepts strings, so convert the
>                 # numeric length and E-value first.
>                 print '\t'.join([alignment.title,
>                                  str(alignment.length),
>                                  str(hsp.expect)
>                                  ])
>
>
> My thinking has been along these lines.
>
>  - My blast output has been htmlized by NCBI's tool, so I think I need
> NCBIWWW's parser.
>
>  - There are multiple sequences in my fasta input - So I need an
> iterator.
>
>  - NCBIWWW doesn't seem to provide an iterator, so I'm hoping to use
> NCBIStandalone's iterator. This assumption is suspect. But I don't yet
> know the biopython code base well enough to know a better alternative.
>
> Any help is greatly appreciated.
>
> Michael Cariaso
>
>
> <tyrkin2.html>
> _______________________________________________
> BioPython mailing list  -  BioPython at biopython.org
> http://biopython.org/mailman/listinfo/biopython