[Biojava-dev] Modifications to DistributionTools.java
Matthew Pocock
matthew_pocock at yahoo.co.uk
Fri Feb 14 20:55:10 EST 2003
Lachlan Coin wrote:
> Hi,
>
> I have been using DistributionTools.java and wanted to commit a few
> changes. If no one particularly objects, then I will go ahead and commit
> these.
>
> - I have added a method:
> public Distribution jointDistributionOverAlignment(Alignment a,
> boolean countGaps,
> double nullWeight, int[] cols)
> this just returns the joint distribution of several columns in the
> alignment. It is useful for calculating mutual information for two
> columns in an alignment
Sounds great. Commit this one.
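
To illustrate why a joint distribution over two columns is what mutual information needs, here is a minimal sketch in plain Java. It does not use the BioJava API: the class name MutualInformationSketch and its method are hypothetical, columns are passed as plain char arrays, and the countGaps and nullWeight options of the proposed method are ignored (every character is simply counted).

```java
import java.util.HashMap;
import java.util.Map;

public class MutualInformationSketch {

    /** Mutual information, in bits, between two equal-length columns of symbols. */
    public static double mutualInformation(char[] colA, char[] colB) {
        if (colA.length != colB.length || colA.length == 0) {
            throw new IllegalArgumentException("columns must be non-empty and of equal length");
        }
        int n = colA.length;

        // Count symbol pairs (the joint distribution) and single symbols (the marginals).
        Map<String, Integer> joint = new HashMap<>();
        Map<Character, Integer> margA = new HashMap<>();
        Map<Character, Integer> margB = new HashMap<>();
        for (int i = 0; i < n; i++) {
            joint.merge("" + colA[i] + colB[i], 1, Integer::sum);
            margA.merge(colA[i], 1, Integer::sum);
            margB.merge(colB[i], 1, Integer::sum);
        }

        // MI = sum over observed pairs of p(x,y) * log2( p(x,y) / (p(x) * p(y)) ).
        double mi = 0.0;
        for (Map.Entry<String, Integer> e : joint.entrySet()) {
            String pair = e.getKey();
            double pxy = e.getValue() / (double) n;
            double px = margA.get(pair.charAt(0)) / (double) n;
            double py = margB.get(pair.charAt(1)) / (double) n;
            mi += pxy * Math.log(pxy / (px * py)) / Math.log(2.0);
        }
        return mi;
    }
}
```

Two columns that always vary together (for example "ACAC" and "GTGT") share one bit, while columns whose symbols pair up uniformly share none.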
>
> - I have changed
> public Map shannonEntropy(Distribution observed, double logBase)
>
> currently this creates a symbol->entropy map, and in the map it
> puts p*log(1/p). I think it is more natural to put log(1/p) in the map,
> as this is a reflection of the uncertainty of a particular outcome (the
> other is the weighted uncertainty). I.e. if we have a weighted coin with
> 10% probability of heads, then a head carries log2(10) bits of
> information. I have also set things up so that the Map only has entries
> for symbols which have non-zero probability.
Not sure about this. My information theory is ropey at best, but I
thought that the information of a probability was (- p * log (p)) or
equivalently p * log (1/p) but perhaps I'm wrong. Could someone who
knows tell me?
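
Both expressions appear in information theory, but they measure different things: log(1/p) is the information carried by one particular outcome, while p * log(1/p) is that outcome's term in the entropy sum over the whole distribution. A minimal sketch in plain Java, with hypothetical names and no BioJava dependency, puts the two side by side:

```java
public class SurprisalSketch {

    private static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /** Information, in bits, carried by a single outcome of probability p > 0. */
    public static double surprisal(double p) {
        return log2(1.0 / p);
    }

    /** That outcome's term p * log2(1/p) in the entropy sum; taken as 0 when p == 0. */
    public static double entropyTerm(double p) {
        return p == 0.0 ? 0.0 : p * surprisal(p);
    }

    /** Shannon entropy, in bits: the sum of the entropy terms over all outcomes. */
    public static double entropy(double[] probs) {
        double h = 0.0;
        for (double p : probs) {
            h += entropyTerm(p);
        }
        return h;
    }
}
```

For a coin with heads probability 0.1, surprisal(0.1) is log2(10), roughly 3.32 bits, whereas the entropy of the whole coin, entropy(new double[] {0.1, 0.9}), is only about 0.47 bits because the surprising outcome is rare.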
>
> - Consequently, I have also changed
>
> public double bitsOfInformation(Distribution observed)
>
> as it relied on shannonEntropy to calculate this. It now
> calculates the shannonEntropy map and takes the average of the values in
> this map, weighted by each symbol's probability under the observed
> distribution.
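
The weighted average described above can be sketched as follows. This is an illustration only, not the committed code: a plain Map from symbol name to probability stands in for a BioJava Distribution, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WeightedAverageSketch {

    /** Maps each symbol with non-zero probability to log(1/p) in the given base. */
    public static Map<String, Double> surprisalMap(Map<String, Double> probs, double logBase) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : probs.entrySet()) {
            double p = e.getValue();
            if (p > 0.0) {
                // Zero-probability symbols get no entry, as described in the proposal.
                out.put(e.getKey(), Math.log(1.0 / p) / Math.log(logBase));
            }
        }
        return out;
    }

    /** Averages the map values, weighting each by its symbol's probability. */
    public static double weightedAverage(Map<String, Double> probs, Map<String, Double> values) {
        double sum = 0.0;
        for (Map.Entry<String, Double> e : values.entrySet()) {
            sum += probs.get(e.getKey()) * e.getValue();
        }
        return sum;
    }
}
```

Because each value log(1/p) is weighted by p, the result is the Shannon entropy of the distribution, so the sum of the old map's p * log(1/p) entries and the weighted average of the new map's entries agree.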
>
>
> I have also added some jUnit tests to test these methods.
>
> Thanks,
>
> Lachlan
>
>
>
> -------------------------------------------------------------
> Lachlan Coin
> Wellcome Trust Sanger Institute Magdalene College
> Cambridge CB10 1SA Cambridge CB30AG
> Ph: +44 1223 494 820
> Fax: +44 1223 494 919
> ------------------------------------------------------------
>
--
BioJava Consulting LTD - Support and training for BioJava
http://www.biojava.co.uk