[Bioperl-l] PopGen Question

Michael Hague mhague at mail.sfsu.edu
Fri Mar 15 15:21:45 EDT 2013


Hello,
I'm trying to calculate some diversity statistics for an aligned dataset in BioPerl. My aligned dataset is a FASTA file. I've been able to import the file and calculate the number of segregating sites, Tajima's D, and Watterson's theta using the following script:


use strict;
use warnings;

use Bio::AlignIO;
use Bio::PopGen::Utilities;
use Bio::PopGen::Statistics;
use Bio::PopGen::Population;

# Read the first alignment from the FASTA file.
my $io1 = Bio::AlignIO->new(-file   => 'Callisaurus_cytb_final.fas',
                            -format => 'fasta');
my $aln1 = $io1->next_aln;

# Convert the alignment to a population object, keeping monomorphic sites.
my $pop1 = Bio::PopGen::Utilities->aln_to_population(-alignment           => $aln1,
                                                     -include_monomorphic => 1);

# Compute the summary statistics for the population.
my $stats1    = Bio::PopGen::Statistics->new();
my $D1        = $stats1->tajima_D($pop1);
my $theta1    = $stats1->theta($pop1);
my $segsites1 = $stats1->segregating_sites_count($pop1);

print "\nPopulation #1:\n";
print "Number of segregating sites = $segsites1\n";
print "Tajima's D = $D1\n";
print "Watterson's theta = $theta1\n\n";

However, I would also like to calculate haplotype diversity. So far I haven't found a BioPerl module that explicitly calculates haplotype diversity. Is there a relatively simple way to do this in BioPerl? I haven't even found a way to count allele frequencies from my alignment file. Is that possible in BioPerl?
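
To be concrete about what I'm after, here is a minimal sketch in plain Perl of the calculation as I understand it. It is not a BioPerl API: the sub names haplotype_counts and haplotype_diversity are my own, and the sequences are toy data. It assumes that each distinct aligned sequence string counts as one haplotype (so gaps and ambiguity codes make sequences "different", which may not be what one wants) and uses Nei's estimator H = n/(n-1) * (1 - sum(p_i^2)), where p_i is the frequency of haplotype i among n sequences.

```perl
use strict;
use warnings;

# Count how many times each distinct sequence string (haplotype) occurs.
# Sequences are upper-cased so that case differences are ignored.
sub haplotype_counts {
    my @seqs = @_;
    my %count;
    $count{ uc $_ }++ for @seqs;
    return \%count;
}

# Nei's haplotype diversity: H = n/(n-1) * (1 - sum(p_i^2)).
# Returns nothing (undef in scalar context) for fewer than two sequences.
sub haplotype_diversity {
    my @seqs = @_;
    my $n    = scalar @seqs;
    return if $n < 2;

    my $counts = haplotype_counts(@seqs);
    my $sum_sq = 0;
    $sum_sq += ( $_ / $n )**2 for values %$counts;

    return ( $n / ( $n - 1 ) ) * ( 1 - $sum_sq );
}

# Toy data, purely illustrative.
my @seqs   = qw(ACGT ACGT ACGA TCGA);
my $n      = scalar @seqs;
my $counts = haplotype_counts(@seqs);

# Print each haplotype with its count and frequency.
for my $hap ( sort keys %$counts ) {
    printf "%s\t%d\t%.3f\n", $hap, $counts->{$hap}, $counts->{$hap} / $n;
}
printf "Haplotype diversity = %.4f\n", haplotype_diversity(@seqs);
```

In my script the toy list could presumably be replaced with the aligned sequences themselves, e.g. my @seqs = map { $_->seq } $aln1->each_seq; but I would still prefer a built-in BioPerl method if one exists.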

Thanks so much for the help,
Mike


