[Biopython-dev] comments on BLAT parser
Yair Benita
y.benita at wanadoo.nl
Mon Aug 8 10:49:16 EDT 2005
Hi All,
Jeff Chang and I made a few changes to the NCBIStandalone module, and you can
now use it to parse BLAT output. A few comments on that:
1. BLAT can be run either with the standalone blat program or with the
gfServer/gfClient pair. The testing was done with gfServer/gfClient version 32.
2. Use the option -out=blast to get the output file in BLAST format.
3. When using BLAT to compare a DNA query to a DNA database, everything
works perfectly well. However, when comparing a protein query to a
translated DNA database, there is a bug in the BLAST output. The subject
coordinates are wrong if the hit is on the opposite strand. This bug is
known and will be fixed in the next release of BLAT. For now, if you compare
proteins to a translated DNA database, use the psl format.
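To make points 2 and 3 concrete, here is a rough Python sketch of assembling the command line for the standalone blat program. The options -out, -t and -q are blat's own; the names build_blat_command and run_blat, the protein_vs_translated_dna flag, and the file names are made up for illustration. It does not cover gfServer/gfClient, and I have not checked it against every BLAT version.

```python
import subprocess


def build_blat_command(database, query, output, protein_vs_translated_dna=False):
    """Return the argument list for a standalone blat run.

    database, query and output are file paths chosen by the caller
    (hypothetical names, not taken from the original post).
    """
    cmd = ["blat", database, query, output]
    if protein_vs_translated_dna:
        # Point 3: BLAST-style subject coordinates can be wrong on the
        # opposite strand, so request psl output for this case.
        cmd += ["-t=dnax", "-q=prot", "-out=psl"]
    else:
        # Point 2: BLAST-style output that NCBIStandalone can parse.
        cmd.append("-out=blast")
    return cmd


def run_blat(*args, **kwargs):
    """Run blat; assumes the blat executable is on the PATH."""
    subprocess.check_call(build_blat_command(*args, **kwargs))
```

For a DNA query against a DNA database, run_blat("genome.fa", "query.fa", "blat_output.txt") would write a file that the parsing example can read.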
Below is an example of parsing BLAT output (note that query_end and
sbjct_end have also been added to the NCBIStandalone module).
Yair
##################################################
from Bio.Blast import NCBIStandalone

# BLAT output produced with -out=blast
BlatFile = "blat_output.txt"
blast_out = open(BlatFile, 'r')
b_parser = NCBIStandalone.BlastParser()
b_iterator = NCBIStandalone.Iterator(blast_out, b_parser)

while True:
    # next() returns None once every record has been read
    b_record = b_iterator.next()
    if b_record is None:
        break
    print "Query used:", b_record.query
    for hitX in b_record.alignments:
        print "\t Target: ", hitX.title
        for hspX in hitX.hsps:
            print "\t\tQuery location: %s to %s" % (hspX.query_start,
                                                    hspX.query_end)
            print "\t\ttarget location: %s to %s" % (hspX.sbjct_start,
                                                     hspX.sbjct_end)
            print "\t\tstrand:", hspX.strand
            print "\t\tscore: %s" % hspX.score
            print "\t\tbits: %s" % hspX.bits
            print "\t\texpect: %s" % hspX.expect
            print "\t\tidentity:", hspX.identities
            # separator between HSPs
            print "\t\t" + "-"*20

blast_out.close()
##################################################
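For the protein versus translated DNA case in point 3, where psl output is recommended, here is a minimal sketch of reading psl lines without Biopython. It assumes the standard 21 tab-separated psl columns. The names PslHit and parse_psl_line and the lower-case field names are my own, and the sample line in the usage note is invented for illustration. As far as I know, psl start coordinates are zero-based and end coordinates are exclusive, but check that against the BLAT documentation before relying on it.

```python
from collections import namedtuple

# The 21 psl columns, in file order (field names are illustrative).
PSL_FIELDS = [
    "matches", "mismatches", "rep_matches", "n_count",
    "q_num_insert", "q_base_insert", "t_num_insert", "t_base_insert",
    "strand", "q_name", "q_size", "q_start", "q_end",
    "t_name", "t_size", "t_start", "t_end",
    "block_count", "block_sizes", "q_starts", "t_starts",
]
PslHit = namedtuple("PslHit", PSL_FIELDS)

TEXT_COLUMNS = (8, 9, 13)    # strand, q_name, t_name
LIST_COLUMNS = (18, 19, 20)  # comma-separated lists with a trailing comma


def parse_psl_line(line):
    """Return a PslHit for a psl data line, or None for header/blank lines."""
    cols = line.rstrip("\r\n").split("\t")
    if len(cols) != len(PSL_FIELDS) or not cols[0].isdigit():
        return None
    values = []
    for index, col in enumerate(cols):
        if index in TEXT_COLUMNS:
            values.append(col)
        elif index in LIST_COLUMNS:
            values.append([int(x) for x in col.split(",") if x])
        else:
            values.append(int(col))
    return PslHit(*values)
```

Usage would look like this, with a made-up line:

```python
line = "30\t0\t0\t0\t0\t0\t0\t0\t+-\tprot1\t30\t0\t30\tchr1\t1000\t100\t190\t1\t30,\t0,\t810,\n"
hit = parse_psl_line(line)
print("%s vs %s, strand %s, target %d to %d"
      % (hit.q_name, hit.t_name, hit.strand, hit.t_start, hit.t_end))
```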