[Bioperl-l] new methods for Bio::Align::DNAStatistics

Thu Jul 24 14:22:55 EDT 2003

Hello all,

There are three new methods in Bio::Align::DNAStatistics for looking at
synonymous/non-synonymous substitutions in a protein-coding DNA
alignment using the Nei-Gojobori method. The idea is that you can get
some indication of positive/negative selection on a sequence by plugging
in an alignment object, subject to the usual caveats about sequence
length, number of substitutions and unusual nucleotide composition.
The methods are as follows.

        1.  $stats_obj->calc_KaKs_Pair($alnobj, $seq_name1, $seq_name2);

                Looks at a named pair of sequences in an alignment.
                Returns an array containing one hash, which holds the
                z_score, the number of synonymous/non-synonymous
                changes, Ps and Pn, ds and dn (calculated from Ps and
                Pn using Jukes-Cantor) and the variances of ds and dn.

        2. $stats_obj->calc_all_KaKs_pairs($alnobj);

                As 1, but calculates the data for all pairwise
                combinations and returns them as an array.

        3. $stats_obj->calc_average_KaKs($alnobj)

                Gets the average ds and dn values over the alignment
                and calculates their variances by bootstrapping.
                Returns a hash of the average ds and dn, their
                variances and the z_score.
                This is all Perl, so I wouldn't recommend it for large
                numbers of long sequences.
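
For anyone unfamiliar with the Jukes-Cantor step mentioned under 1, here
is a rough standalone sketch of the arithmetic. It is not the code from
the module: the sub names (jc_distance, jc_variance, z_score) and the
input numbers are made up for illustration, the variance is the standard
large-sample formula, and the z-score form (dn - ds) / sqrt(var_ds +
var_dn) is my assumption about how the two distances are compared.

```perl
use strict;
use warnings;

# Jukes-Cantor correction of a proportion of differences p:
#   d = -3/4 * ln(1 - 4p/3)
# Returns undef when p >= 0.75, where the correction is undefined.
sub jc_distance {
    my ($p) = @_;
    return undef if $p >= 0.75;
    return -0.75 * log(1 - 4 * $p / 3);
}

# Large-sample variance of the corrected distance over n sites:
#   var(d) = p(1 - p) / ((1 - 4p/3)^2 * n)
sub jc_variance {
    my ($p, $n) = @_;
    return undef if $p >= 0.75 || $n <= 0;
    return $p * (1 - $p) / ((1 - 4 * $p / 3) ** 2 * $n);
}

# Assumed test statistic: z = (dn - ds) / sqrt(var_ds + var_dn)
sub z_score {
    my ($ds, $dn, $var_ds, $var_dn) = @_;
    my $sd = sqrt($var_ds + $var_dn);
    return undef if $sd == 0;
    return ($dn - $ds) / $sd;
}

# Hypothetical inputs: Ps and Pn are the proportions of synonymous and
# non-synonymous differences, S and N the numbers of synonymous and
# non-synonymous sites.
my ($Ps, $Pn, $S, $N) = (0.10, 0.02, 150, 450);

my $ds = jc_distance($Ps);
my $dn = jc_distance($Pn);
my $z  = z_score($ds, $dn, jc_variance($Ps, $S), jc_variance($Pn, $N));

printf "ds=%.4f dn=%.4f z=%.2f\n", $ds, $dn, $z;
```

A negative z here would point towards purifying selection (dn < ds) and
a positive one towards positive selection, with the caveats above.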

They are tested to some extent, but I would be grateful if any
phylogenetics experts out there would give them a workout with
their favourite alignments.

--
Dr Richard Adams
Bioinformatician,
Psychiatric Genetics Group,
Medical Genetics,
Molecular Medicine Centre,
Western General Hospital,
Crewe Rd West,
Edinburgh UK
EH4 2XU

Tel: 44 131 651 1084
richard.adams at ed.ac.uk