[Biopython-dev] comments on BLAT parser

Yair Benita y.benita at wanadoo.nl
Mon Aug 8 10:49:16 EDT 2005


Hi All,
Jeff Chang and I made a few changes to the NCBIStandalone module, and you can
now use it to parse BLAT output. A few comments on that:

1. BLAT can be run either with the standalone blat program or with the
gfServer/gfClient pair. Testing was done with gfServer/gfClient version 32.

2. Use the option -out=blast to get the output file in BLAST format.

3. When using BLAT to compare a DNA query to a DNA database, everything
works perfectly well. However, when comparing a protein query to a
translated DNA database, BLAT's BLAST-format output has a bug: the subject
coordinates are wrong if the hit is on the opposite strand. This bug is
known and will be fixed in the next release of BLAT. For now, if you compare
proteins to a translated DNA database, use the psl format instead.
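
For the psl fallback mentioned in point 3, here is a minimal sketch of reading
one psl line with plain Python. It is not part of Biopython. It assumes the
standard 21 tab-separated psl columns and that any header lines have already
been skipped. The function name parse_psl_line and the sample line are made up
for illustration.

```python
# Column names of a psl line, in the order used by the psl format.
PSL_FIELDS = [
    "matches", "misMatches", "repMatches", "nCount",
    "qNumInsert", "qBaseInsert", "tNumInsert", "tBaseInsert",
    "strand", "qName", "qSize", "qStart", "qEnd",
    "tName", "tSize", "tStart", "tEnd",
    "blockCount", "blockSizes", "qStarts", "tStarts",
]
TEXT_FIELDS = {"strand", "qName", "tName"}
LIST_FIELDS = {"blockSizes", "qStarts", "tStarts"}


def parse_psl_line(line):
    """Turn one psl data line into a dict (hypothetical helper)."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(PSL_FIELDS):
        raise ValueError("expected %d columns, got %d"
                         % (len(PSL_FIELDS), len(values)))
    record = {}
    for name, value in zip(PSL_FIELDS, values):
        if name in TEXT_FIELDS:
            record[name] = value
        elif name in LIST_FIELDS:
            # Comma-separated lists carry a trailing comma, e.g. "50,"
            record[name] = [int(x) for x in value.split(",") if x]
        else:
            record[name] = int(value)
    return record


# Made-up example line: a protein query hitting the minus strand of a
# translated DNA target (two-character strand field).
sample = "\t".join([
    "50", "0", "0", "0", "0", "0", "0", "0",
    "+-", "prot1", "50", "0", "50",
    "chr1", "1000", "100", "250",
    "1", "50,", "0,", "750,",
])
record = parse_psl_line(sample)
```

With this sample, record["strand"] is "+-" and the target location is read
from record["tStart"] and record["tEnd"].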

Below is an example of parsing BLAT output (note that the query_end and
sbjct_end attributes have also been added to the NCBIStandalone module).

Yair

##################################################
from Bio.Blast import NCBIStandalone

# BLAT output written with -out=blast
BlatFile = "blat_output.txt"
blast_out = open(BlatFile,'r')
b_parser = NCBIStandalone.BlastParser()
b_iterator = NCBIStandalone.Iterator(blast_out, b_parser)

# The iterator returns None once every record has been read
while 1:
    b_record = b_iterator.next()

    if b_record is None:
        break
        
    print "Query used:", b_record.query
    
    for hitX in b_record.alignments:
        print "\t Target: ", hitX.title

        for hspX in hitX.hsps:
            print "\t\tQuery location: %s to %s" % ( hspX.query_start,
                                                        hspX.query_end)
            print "\t\ttarget location: %s to %s" % ( hspX.sbjct_start,
                                                        hspX.sbjct_end)
            print "\t\tstrand:", hspX.strand
            print "\t\tscore: %s" % hspX.score
            print "\t\tbits: %s" % hspX.bits
            print "\t\texpect: %s" % hspX.expect
            print "\t\tidentity:", hspX.identities
            print "\t\t" + "-"*20

blast_out.close()
##################################################
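
The read-until-None loop in the example can also be written with Python's
two-argument iter(), which keeps calling a function until it returns a given
sentinel value. The sketch below only shows that pattern. FakeIterator is a
hypothetical stand-in for any object whose next() method returns None when it
runs out of records, and iterate_records is a made-up helper name. Neither is
part of Biopython.

```python
class FakeIterator(object):
    """Hypothetical stand-in: next() returns None when exhausted."""

    def __init__(self, records):
        self._records = list(records)

    def next(self):
        if self._records:
            return self._records.pop(0)
        return None


def iterate_records(record_iterator):
    """Yield records until record_iterator.next() returns None."""
    # iter(callable, sentinel) stops as soon as the call returns the sentinel
    return iter(record_iterator.next, None)


queries = []
for b_record in iterate_records(FakeIterator(["query1", "query2"])):
    queries.append(b_record)
```

This removes the explicit "while 1" and "break", with the caveat that it
relies on next() returning None rather than raising an exception at the end
of the input.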




