[BioPython] Parsing blast.out

Jeffrey Chang jchang@smi.stanford.edu
Tue, 14 May 2002 13:18:00 -0700


It looks like the BLAST output format has changed.  Try grabbing the
latest version of the parser from CVS.  If that doesn't work, send me
your output file, and I'll see what's up.
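
In the meantime, it can help to look at the lines around the one named in
the error, to see how the output differs from what the parser expects. A
minimal sketch, assuming the output was saved as my_blast.out as in the
quoted script; the helper name show_context is made up for illustration
and is not part of Biopython:

```python
import os


def show_context(lines, needle, before=3, after=3):
    """Return the lines surrounding the first line that contains needle."""
    for i, line in enumerate(lines):
        if needle in line:
            start = max(0, i - before)
            return lines[start:i + after + 1]
    return []  # needle not found


if __name__ == "__main__":
    path = "my_blast.out"  # file name taken from the quoted script
    if os.path.exists(path):
        with open(path) as handle:
            text_lines = handle.read().splitlines()
        # "total letters" is taken from the line quoted in the error message.
        for line in show_context(text_lines, "total letters"):
            # repr() makes blank lines and stray whitespace visible.
            print(repr(line))
```

If the line after the "total letters" line is not blank, that would match
the "Expected blank line" complaint.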

Jeff



On Tuesday, May 14, 2002, at 12:05  PM, Ravinder Singh wrote:

> Hi,
> I'm trying to parse a BLAST output file and have tried it both ways, i.e.
> saving the output to a file and opening a file handle, or wrapping it in
> cStringIO.StringIO.
> I get the following error. Any help would be appreciated. Many thanks,
> Ravinder
> *******
> ------------------------------------------------------------
> SyntaxError: Expected blank line, but got:
>            1,221,820 sequences; 5,507,506,871 total letters
>
> --------------------------------------------------------------
> I know that the BLAST search works, as it writes the BLAST output to a
> file. It gets stuck at the parsing. The problem occurs when I generate the
> b_record, using either handle. If I comment out the b_record1 line it
> prints neither C nor D; however, if I comment out the b_record2 line it
> prints C but not D.
>
> b_record1 = blast_parser.parse(b_results)
> print 'C'
> b_record2 = blast_parser.parse(string_result_handle)
> print 'D'
>
> ****************
> If needed, my code is,
> ----------------------------------------------------------------
> #! /usr/local/bin/python
>
> from Bio import Fasta
>
> file_for_blast = open('m_cold.fasta', 'r')
> f_iterator = Fasta.Iterator(file_for_blast)
>
> f_record = f_iterator.next()
>
> from Bio.Blast import NCBIWWW
> b_results = NCBIWWW.blast('blastn', 'nr', f_record)
>
>
> save_file = open('my_blast.out', 'w')
> blast_results = b_results.read()
> save_file.write(blast_results)
> save_file.close()
>
> import cStringIO
> string_result_handle = cStringIO.StringIO(blast_results)
>
>
> b_results = open('my_blast.out', 'r')
>
>
> print 'A'
> from Bio.Blast import NCBIWWW
>
> blast_parser = NCBIWWW.BlastParser()
> print 'B'
>
> b_record = blast_parser.parse(b_results)
> print 'C'
>
> b_record = blast_parser.parse(string_result_handle)
>
> print 'D'
> *******************
> I'd like to do all of the following if and when the above code works.
> E_VALUE_THRESH = 0.04
>
> for alignment in b_record.alignments:
>  for hsp in alignment.hsps:
>   if hsp.expect < E_VALUE_THRESH:
>    print '****Alignment****'
>    print 'sequence:', alignment.title
>    print 'length:', alignment.length
>    print 'e value:', hsp.expect
>    print hsp.query[0:75] + '...'
>    print hsp.match[0:75] + '...'
>    print hsp.sbjct[0:75] + '...'
> --
> *************************************************************************
> *******
>
> Dr. Ravinder Singh
> Assistant Professor
> MCD Biology
> 347 UCB
> University of Colorado
> Boulder, CO 80309-0347
>
> (303)492-8886 (voice)
> (303)492-7744 (fax)
>
>
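
The filtering loop in the quoted message can be tried without a live BLAST
result. A minimal sketch, assuming stand-in records: the HSP and Alignment
namedtuples, the significant_hits helper, and the sample values are all
hypothetical and only mimic the attribute names used in the loop
(alignments, hsps, expect, title, length); they are not Biopython classes.

```python
from collections import namedtuple

# Hypothetical stand-ins that mimic the attributes used in the quoted loop.
HSP = namedtuple("HSP", "expect query match sbjct")
Alignment = namedtuple("Alignment", "title length hsps")

E_VALUE_THRESH = 0.04  # threshold from the quoted message


def significant_hits(alignments, threshold=E_VALUE_THRESH):
    """Return (alignment, hsp) pairs whose E-value is below threshold."""
    return [(a, h) for a in alignments for h in a.hsps if h.expect < threshold]


# Made-up sample data, for illustration only.
alignments = [
    Alignment("hit_one", 120, [HSP(1e-20, "ACGT", "||||", "ACGT")]),
    Alignment("hit_two", 80, [HSP(0.5, "ACGA", "||| ", "ACGT")]),
]

for alignment, hsp in significant_hits(alignments):
    print("****Alignment****")
    print("sequence:", alignment.title)
    print("length:", alignment.length)
    print("e value:", hsp.expect)
    print(hsp.query[0:75] + "...")
    print(hsp.match[0:75] + "...")
    print(hsp.sbjct[0:75] + "...")
```

Collecting the pairs first keeps the threshold test separate from the
printing, so the same filter can be reused once the parser returns a real
record.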