[Biojava-dev] Modifications to DistributionTools.java

Lachlan Coin lc1 at sanger.ac.uk
Thu Feb 13 10:17:25 EST 2003


Hi,

I have been using DistributionTools.java and would like to commit a few
changes.  If no one objects, I will go ahead and commit them:

  - I have added a method:
public Distribution jointDistributionOverAlignment(Alignment a,
                               boolean countGaps,
                               double nullWeight, int[] cols)
    This returns the joint distribution of several columns in the
alignment.  It is useful, for example, for calculating the mutual
information of two columns in an alignment.

  -  I have changed
	public Map shannonEntropy(Distribution observed, double logBase)

	 Currently this creates a symbol->entropy map, and in the map it
puts p*log(1/p).  I think it is more natural to put log(1/p) in the map,
as this reflects the uncertainty of a particular outcome (the other is
the weighted uncertainty).  E.g. if we have a weighted coin with a 10%
probability of heads, then a head carries log2(10), or about 3.32, bits
of information.  I have also set things up so that the Map only has
entries for symbols which have non-zero probability.

   - Consequently, I have also changed

	public double bitsOfInformation(Distribution observed)

	as it relied on shannonEntropy to calculate this.  It now
calculates the shannonEntropy map and takes the average of the values in
this map, weighted by each symbol's probability in the observed
distribution.
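
To make the quantities above concrete, here is a minimal sketch in plain
Java.  It does not use the BioJava classes: the class name EntropySketch,
the use of Map<String, Double> in place of a Distribution, and the
double[][] joint table in place of a joint Distribution are all
assumptions made for illustration only.  It shows the per-symbol
log(1/p) map (zero-probability symbols omitted), the probability-weighted
average of that map, and how a joint distribution over two columns could
be used to get their mutual information.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only; not the DistributionTools implementation.
public class EntropySketch {

    // Logarithm of x in an arbitrary base.
    static double log(double x, double base) {
        return Math.log(x) / Math.log(base);
    }

    // symbol -> log(1/p), with entries only for symbols where p > 0.
    static Map<String, Double> surprisal(Map<String, Double> dist, double logBase) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : dist.entrySet()) {
            double p = e.getValue();
            if (p > 0.0) {
                out.put(e.getKey(), log(1.0 / p, logBase));
            }
        }
        return out;
    }

    // Average of the log(1/p) values, weighted by each symbol's probability.
    static double weightedAverage(Map<String, Double> dist, double logBase) {
        double total = 0.0;
        for (Map.Entry<String, Double> e : surprisal(dist, logBase).entrySet()) {
            total += dist.get(e.getKey()) * e.getValue();
        }
        return total;
    }

    // Mutual information of two columns from their joint table joint[x][y].
    // Assumes the table is rectangular and its entries sum to 1.
    static double mutualInformation(double[][] joint, double logBase) {
        int nx = joint.length;
        int ny = joint[0].length;
        double[] px = new double[nx]; // marginal of the first column
        double[] py = new double[ny]; // marginal of the second column
        for (int i = 0; i < nx; i++) {
            for (int j = 0; j < ny; j++) {
                px[i] += joint[i][j];
                py[j] += joint[i][j];
            }
        }
        double mi = 0.0;
        for (int i = 0; i < nx; i++) {
            for (int j = 0; j < ny; j++) {
                double p = joint[i][j];
                if (p > 0.0) { // zero cells contribute nothing
                    mi += p * log(p / (px[i] * py[j]), logBase);
                }
            }
        }
        return mi;
    }

    public static void main(String[] args) {
        Map<String, Double> coin = new LinkedHashMap<>();
        coin.put("H", 0.1);
        coin.put("T", 0.9);
        System.out.println("log(1/p) map:     " + surprisal(coin, 2.0));
        System.out.println("weighted average: " + weightedAverage(coin, 2.0));
    }
}
```

With logBase = 2 the results are in bits; for the weighted coin in the
example the rare outcome gets the larger map entry, while the weighted
average stays below one bit.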


I have also added some JUnit tests for these methods.

Thanks,

Lachlan



-------------------------------------------------------------
Lachlan Coin
Wellcome Trust Sanger Institute		Magdalene College
Cambridge  CB10 1SA			Cambridge CB30AG
Ph: +44 1223 494 820
Fax: +44 1223 494 919
------------------------------------------------------------


