[BioPython] blast

Peter Cock p.j.a.cock at googlemail.com
Wed Jul 23 14:05:14 UTC 2008


I wrote:
> I think they are asking for the percentage identity.  Another
> possibility is the bit-score which should be in the BLAST output as a
> floating point number (hsp.score).

I should have said raw score (hsp.score) or bit score (hsp.bits), both
of which might be of interest to your users.

Steganie wrote:
> Hi!
>
> I calculated like this for my program (you blast against your own primer
> stock database and get % match per primers to your query sequence):
>
> percent = float(100) * float(hsp.score) / float(alignment.length)
>

Using hsp.score gives the raw score, which you are then scaling by the
length and 100.  I'm not sure offhand what you've calculated, but if
you want the percentage identity, I think it's just the number of
identically matched letters divided by the alignment length:

percentage_identity = (100.0 * hsp.identities) / hsp.align_length

In the plain text output BLAST gives this number explicitly, e.g.
Identities = 112/146 (76%), Positives = 127/146 (86%), Gaps = 2/146 (1%)
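
As a rough sketch of that calculation, here is a small helper applied to
the numbers from the example line above.  The function name and the
stand-in object are made up for illustration; only the identities and
align_length attribute names come from the formula above.  With a real
parsed BLAST result you would pass in the hsp object itself.

```python
from types import SimpleNamespace


def percentage_identity(hsp):
    """Return identical positions as a percentage of the HSP alignment length."""
    if hsp.align_length == 0:
        raise ValueError("HSP has zero alignment length")
    return 100.0 * hsp.identities / hsp.align_length


# Hypothetical stand-in for a parsed HSP, using the numbers from the
# plain text example (Identities = 112/146).
example_hsp = SimpleNamespace(identities=112, align_length=146)
print(round(percentage_identity(example_hsp), 1))  # 76.7
```

Note this divides by the length of the aligned region (gaps included),
not by the length of the query or of the database sequence.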

There is also hsp.bits which gives the bit score.

Peter


