[Bioperl-l] script to calculate percentage identity for BLAT psl

gopu_36 gopu_36 at yahoo.com
Fri Mar 13 10:55:14 UTC 2009


Hi 
As given in that page I tried as below as one of the script. But I always
get 100% identity. Please let me know the problem in my following code.
Since I don't use any options and am also mapping DNA sequences, I have made
following changes in the code: Please let me know the problem.

1
 my $sizeMul = 1; and commented:
#if ($option{p}) {
         #        $sizeMul = 3;
         #} else {
         #        $sizeMul = 1;
         #}
2
if ($sizeDiff < 0) {
         	$sizeDiff = 0;
                 #if ($option{m}) {
                         #$sizeDiff = 0;
                 #} else {
                         #$sizeDiff = -($sizeDiff);
                 #}
         }

=================================================== 
And perl script is as bleow: Thanks.
 

#!/usr/bin/perl

while(<>) {
        chomp $_;
        my @v = split(/\t/,$_);
        get_pid($_);

}
sub get_pid {
         my @line = @_;
         my $pid = (100.0 - (&pslCalcMilliBad(@line) * 0.1));
	 print "The percentage: $pid\n";
         #return $pid;
}

sub pslCalcMilliBad {
         my @cols = @_;

         # sizeNul depens of dna/Prot
         my $sizeMul = 1;
         #if ($option{p}) {
         #        $sizeMul = 3;
         #} else {
         #        $sizeMul = 1;
         #}

         # cols[0]  matches
         # cols[1]  misMatches
         # cols[2]  repMaches
         # cols[4]  qNumInsert
         # cols[6]  tNumInsert
         # cols[11] qStart
         # cols[12] qEnd
         # cols[15] tStart
         # cols[16] tEnd

         my $qAliSize = $sizeMul * ($cols[12] - $cols[11]);
         my $tAliSize = $cols[16] - $cols[15];

         # I want the minimum of qAliSize and tAliSize
         my $aliSize;
         $qAliSize < $tAliSize ? $aliSize = $qAliSize : $aliSize =  
$tAliSize;

         # return 0 is AliSize == 0
         return 0 if ($aliSize <= 0);

         # size diff
         my $sizeDiff = $qAliSize - $tAliSize;
         if ($sizeDiff < 0) {
         	$sizeDiff = 0;
                 #if ($option{m}) {
                         #$sizeDiff = 0;
                 #} else {
                         #$sizeDiff = -($sizeDiff);
                 #}
         }

         # insert Factor
         my $insertFactor = $cols[4];
         $insertFactor += $cols[6] unless ($option{m});
         my $milliBad = (1000 * ($cols[1]*$sizeMul + $insertFactor +  
&round(3*log( 1 + $sizeDiff)))) / ($sizeMul * ($cols[0] + $cols[2]
+ $cols[1]));
	        return $milliBad;

}

sub round {
         my $number = shift;
         return int($number + .5);
}



gopu_36 wrote:
> 
> Hi,
> 
> I did go through FAQ from BLAT on how to calculate the precentage identity
> from http://genome.ucsc.edu/FAQ/FAQblat#blat4
> As a new comer, I don;t usederstand on how to implement this. Please let
> me know how to plugin the script for my output.psl file. Please let me
> know. It would be of great help.
> 
> Thanks and Regards.
> 

-- 
View this message in context: http://www.nabble.com/script-to-calculate-percentage-identity-for-BLAT-psl-tp22476983p22494163.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.




More information about the Bioperl-l mailing list