[Bioperl-l] Bio::Tools::BPlite::HSP - percentage IDs

simon potter scp@sanger.ac.uk
Wed, 11 Jul 2001 15:23:26 +0100 (BST)


Hello.

I have a question about parsing BLAST output and how to get the
percentage sequence identity.

In HSP.pm the percentage identity is calculated by dividing the number of
matches by the query seq length, rather than the subject seq length (i.e.
it is a re-calculation of the % given in the BLAST output).

Is there a reason for calculating it this way? I've talked to people
around here and the general feeling is that it's better to calculate it
with respect to the subject seq.

The question is what to do about this. Is this something we should
change? Maybe a solution is to provide a choice of how we calculate the
%id. What do people think?
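
To make the difference concrete, here is a rough sketch in Python of the
two calculations and of what a "choice" might look like. This is not the
code in HSP.pm: the function name, the `relative_to` argument and the
example numbers are all made up for illustration, and it assumes "seq
length" means the length of the aligned query or subject segment in the
HSP.

```python
def percent_identity(matches, query_len, subject_len, relative_to="query"):
    """Return % identity of an HSP relative to the chosen sequence length.

    Hypothetical helper for illustration only; not part of BioPerl.
    """
    lengths = {"query": query_len, "subject": subject_len}
    if relative_to not in lengths:
        raise ValueError("relative_to must be 'query' or 'subject'")
    length = lengths[relative_to]
    if length <= 0:
        raise ValueError("sequence length must be positive")
    return 100.0 * matches / length


# Invented HSP: 45 identical positions, aligned query segment of 50
# residues, aligned subject segment of 48 residues.
by_query = percent_identity(45, 50, 48, relative_to="query")
by_subject = percent_identity(45, 50, 48, relative_to="subject")
print(by_query, by_subject)
```

With these invented numbers the two conventions give 90.0 and 93.75, so
the denominator matters whenever the two aligned segments differ in
length.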

Thanks,

Simon Potter,
EnsEMBL team, Sanger