[BioPython] blast

Peter Cock p.j.a.cock at googlemail.com
Wed Jul 23 15:36:40 UTC 2008


> Sorry, both give me wrong percentage if I try on my database.
>
> Look here, compare alignment and percentage:
>
> http://picasaweb.google.de/luecks/Python02/photo#5226228430697476754
>
> e.g Hit 13 should give 60 %
>
> What you recommend to use instead of hsp.score?

I am assuming you want to parse some BLAST output in order to populate
this database.  How about something based on this:

from Bio.Blast import NCBIXML
for record in NCBIXML.parse(open("test.xml")) :
    # query_id is a string identifier, so it is formatted with %s
    print("Query %s length %i" % (record.query_id, record.query_letters))
    for alignment in record.alignments :
        for hsp in alignment.hsps :
            # Identities as a percentage of the full query length
            percentage_identities_versus_full_query = \
                (100.0 * hsp.identities) / record.query_letters
            print(" vs %s gives %0.1f%% identities"
                  % (alignment.hit_id, percentage_identities_versus_full_query))

This uses the fact that the original query length is recorded in the
record object as the "query_letters" property (this name was a
historical choice based on the plain text blast output).

For the example you gave, I would expect hsp.identities ==
12 (and hsp.align_length == 12) while record.query_letters == 20
which will give you the desired output of 60%.
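
To make the difference between the two possible denominators concrete,
here is a minimal sketch using plain integers instead of a parsed BLAST
record. The helper name percent_identity and the numbers are made up for
illustration and are not taken from your database; in real code the
values would come from hsp.identities, hsp.align_length and
record.query_letters.

```python
def percent_identity(identities, length):
    """Return identities as a percentage of the given length."""
    if length <= 0:
        raise ValueError("length must be positive")
    return (100.0 * identities) / length

# Made-up numbers for illustration only (not from the thread):
identities = 18        # stands in for hsp.identities
alignment_length = 20  # stands in for hsp.align_length
query_length = 30      # stands in for record.query_letters

# Percentage over the aligned region only
versus_alignment = percent_identity(identities, alignment_length)
# Percentage over the whole query, including unaligned positions
versus_query = percent_identity(identities, query_length)

print("%0.1f%% of the aligned region" % versus_alignment)
print("%0.1f%% of the full query" % versus_query)
```

Dividing by the alignment length only tells you how good the aligned
region is, so a short perfect match still scores 100%. Dividing by the
query length also penalises the part of the query that did not align,
which I think is the figure you are after.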

Peter


