[BioPython] blast
Peter Cock
p.j.a.cock at googlemail.com
Wed Jul 23 15:36:40 UTC 2008
> Sorry, both give me wrong percentage if I try on my database.
>
> Look here, compare alignment and percentage:
>
> http://picasaweb.google.de/luecks/Python02/photo#5226228430697476754
>
> e.g Hit 13 should give 60 %
>
> What you recommend to use instead of hsp.score?
I am assuming you want to parse some BLAST output in order to populate
this database. How about something based on this:
from Bio.Blast import NCBIXML

# Each record holds the results for one query in the BLAST XML output
for record in NCBIXML.parse(open("test.xml")):
    # query_id is a string; query_letters is the integer query length
    print("Query %s length %i" % (record.query_id, record.query_letters))
    for alignment in record.alignments:
        for hsp in alignment.hsps:
            # Identities as a percentage of the full query length
            percentage_identities_versus_full_query = (
                100.0 * hsp.identities) / record.query_letters
            print(" vs %s gives %0.1f%% identities"
                  % (alignment.hit_id, percentage_identities_versus_full_query))
This uses the fact that the original query length is recorded in the
record object as the "query_letters" property (this name was a
historical choice based on the plain text blast output).
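To make it clearer where these numbers come from, here is a rough sketch that reads the same fields straight from the XML using only the standard library. As far as I know, query_letters is filled from the Iteration_query-len element and hsp.identities from Hsp_identity. The XML fragment below is hand-written for illustration (it is not real BLAST output, and the IDs and numbers are made up), and the function name is my own:

```python
import xml.etree.ElementTree as ET

# Hand-written fragment for illustration only; real BLAST XML output
# contains many more elements than this.
FRAGMENT = """<Iteration>
  <Iteration_query-ID>query_1</Iteration_query-ID>
  <Iteration_query-len>50</Iteration_query-len>
  <Iteration_hits>
    <Hit>
      <Hit_id>hit_A</Hit_id>
      <Hit_hsps>
        <Hsp>
          <Hsp_identity>30</Hsp_identity>
          <Hsp_align-len>30</Hsp_align-len>
        </Hsp>
      </Hit_hsps>
    </Hit>
  </Iteration_hits>
</Iteration>"""


def identities_versus_query(xml_text):
    """Return (hit_id, percent identity versus full query) for each HSP."""
    iteration = ET.fromstring(xml_text)
    # This is the value the parser exposes as record.query_letters
    query_length = int(iteration.findtext("Iteration_query-len"))
    results = []
    for hit in iteration.iter("Hit"):
        hit_id = hit.findtext("Hit_id")
        for hsp in hit.iter("Hsp"):
            # This is the value the parser exposes as hsp.identities
            identities = int(hsp.findtext("Hsp_identity"))
            results.append((hit_id, 100.0 * identities / query_length))
    return results


for hit_id, percent in identities_versus_query(FRAGMENT):
    print(" vs %s gives %0.1f%% identities" % (hit_id, percent))
```

In practice I would still use NCBIXML rather than walking the XML by hand; this is only meant to show which fields feed the calculation.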
For the example you gave, I would expect hsp.identities ==
12 (and hsp.align_length == 12) while record.query_letters == 60,
which will give you the desired output of 60%.
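The choice of denominator is what changes the answer, so here is a small sketch of the arithmetic on its own. The helper name and the numbers are made up for illustration and are not taken from your results:

```python
def percent_identity(identities, length):
    """Identities as a percentage of the given length."""
    if length <= 0:
        raise ValueError("length must be positive")
    return 100.0 * identities / length


# Hypothetical HSP: 45 identical positions in a 50 column alignment,
# from a query that is 75 letters long.
identities = 45
align_length = 50
query_length = 75

# Relative to the aligned region only (what BLAST reports per HSP)
versus_alignment = percent_identity(identities, align_length)
# Relative to the whole query (what the loop above calculates)
versus_query = percent_identity(identities, query_length)

print("%0.1f%% of the alignment" % versus_alignment)
print("%0.1f%% of the full query" % versus_query)
```

A short local alignment can have a high percentage over the aligned region while covering only a fraction of the query, which is why the two figures can differ so much.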
Peter