[Bioperl-l] How to get the entropy of each nucleotide of analigment?

Mark A. Jensen maj at fortinbras.us
Sun Nov 23 09:30:24 EST 2008


Cláudio -

If you have a Bio::SimpleAlign object prepared ( maybe from

$alnio = new Bio::AlignIO(-format=>'fasta', -file=>'your.fas');
$aln = $alnio->next_aln;

try the following function, as

$entropies = entropy_by_column( $aln )

(which also gives an example of how you (or I, anyway) might 
manipulate
alignments on a per-column basis)

cheers, Mark

=head2 entropy_by_column

Title   : entropy_by_column
Usage   : entropy_by_column( $aln )
Function: returns the Shannon entropy for each column in an alignment
Example :
Returns : hashref of the form { $column_number => $entropy, ... }
Args    : a Bio::SimpleAlign object

=cut

sub entropy_by_column {
   my ($aln) = @_;
   my (%ent);
   foreach my $col (1..$aln->length) {
my %res;
foreach my $seq ($aln->each_seq) {
    my $loc = $seq->location_from_column($col);
    next if $loc->location_type eq 'IN-BETWEEN';
    $res{$seq->subseq($loc)}++;
}
$ent{$col} = entropy(values %res);
   }
   return [%ent];
}

=head2 entropy

Title   : entropy
Usage   : entropy( @numbers )
Function: returns the Shannon entropy of an array of numbers,
          each number represents the count of a unique category
          in a collection of items
Example : entropy ( 1, 1, 1 )  # returns 1.09861228866811 = log(1/3)
Returns : Shannon entropy or undef if entropy undefined;
Args    : an array

=cut

sub entropy {
   @a = map {$_ || ()} @_;
   return undef unless grep {$_>0} @a;
   return undef if grep {$_<0} @a;
   my $t = eval join('+', @a);
   map {$_ /= $t} @a;
   return eval(join('+', map { $_ ? -$_*log($_) : () } @a));
}

> ----- Original Message ----- 
> From: "Cláudio Sampaio" <patola at gmail.com>
> To: <bioperl-l at bioperl.org>
> Sent: Sunday, November 23, 2008 12:19 AM
> Subject: [Bioperl-l] How to get the entropy of each nucleotide of
> analigment?
>
>
> Hi all,
>
> I am still a newbie to bioperl, and while searching for a way to 
> calculate
> the entropy score of an alignment I came to Matrix scoring -
> http://doc.bioperl.org/releases/bioperl-1.4/Bio/Matrix/Scoring.html  
> - but
> couldn't figure out how it relate to the Bio::Align class and 
> objects. Can
> someone more knowledgeable give me a clue on how to start?
>
> Best regards,
>
> Cláudio "Patola"
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>



More information about the Bioperl-l mailing list