[Biojava-l] unsupervised training of transition weights
Thomas Down
td2 at sanger.ac.uk
Fri Mar 31 10:58:38 UTC 2006
On 30 Mar 2006, at 16:41, wendy wong wrote:
> Hi,
>
> I am trying to train my HMM using unsupervised training (I don't need
> to train the emission probabilities). I was wondering how I can do so
> in BioJava. Do I have to implement the TransitionTrainer interface?
The easiest way to do this is to use UntrainableDistributions for all
the distributions that you don't want to be trained (in your case, the
emission distributions):
http://www.biojava.org/docs/api14/org/biojava/bio/dist/UntrainableDistribution.html
If UntrainableDistribution doesn't fit your requirements, the
alternative is to create your own Distribution implementation with a
registerTrainer method that creates a "dummy" (i.e. doesn't do
anything) DistributionTrainer. UntrainableDistribution is just a
subclass of SimpleDistribution which replaces the registerTrainer
method with a non-functional version.
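To make the "dummy trainer" idea concrete, here is a minimal, self-contained sketch of the pattern. Note that Trainer, TrainingContext, SimpleDist and FixedDist are simplified stand-ins invented for this illustration; they are not BioJava's actual classes, and the real registerTrainer and DistributionTrainer signatures differ. The sketch only shows why overriding registerTrainer with a do-nothing trainer leaves the weights untouched.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a distribution trainer (NOT the BioJava interface).
interface Trainer {
    void addCount(String symbol, double count);
    void train();
}

// Hypothetical stand-in for the object that collects one trainer per distribution.
class TrainingContext {
    private final Map<Object, Trainer> trainers = new LinkedHashMap<>();
    void register(Object dist, Trainer t) { trainers.put(dist, t); }
    Trainer trainerFor(Object dist) { return trainers.get(dist); }
    void trainAll() { for (Trainer t : trainers.values()) t.train(); }
}

// A trainable distribution: its trainer re-estimates weights from counts.
class SimpleDist {
    protected final Map<String, Double> weights = new LinkedHashMap<>();
    void setWeight(String s, double w) { weights.put(s, w); }
    double getWeight(String s) { return weights.getOrDefault(s, 0.0); }

    void registerTrainer(TrainingContext ctx) {
        final Map<String, Double> counts = new LinkedHashMap<>();
        ctx.register(this, new Trainer() {
            public void addCount(String symbol, double count) {
                counts.merge(symbol, count, Double::sum);
            }
            public void train() {
                final double total =
                    counts.values().stream().mapToDouble(Double::doubleValue).sum();
                if (total == 0.0) return; // nothing observed, keep old weights
                weights.replaceAll((s, w) -> counts.getOrDefault(s, 0.0) / total);
                counts.clear();
            }
        });
    }
}

// An untrainable distribution: same storage, but the trainer does nothing.
class FixedDist extends SimpleDist {
    @Override
    void registerTrainer(TrainingContext ctx) {
        ctx.register(this, new Trainer() {
            public void addCount(String symbol, double count) { /* ignored */ }
            public void train() { /* weights stay as they are */ }
        });
    }
}

class DummyTrainerDemo {
    // Returns { trained transition weight for "stay", emission weight for "a" }.
    static double[] run() {
        SimpleDist transitions = new SimpleDist();
        transitions.setWeight("stay", 0.5);
        transitions.setWeight("leave", 0.5);
        SimpleDist emissions = new FixedDist();
        emissions.setWeight("a", 0.9);
        emissions.setWeight("c", 0.1);

        TrainingContext ctx = new TrainingContext();
        transitions.registerTrainer(ctx);
        emissions.registerTrainer(ctx);
        ctx.trainerFor(transitions).addCount("stay", 3.0);
        ctx.trainerFor(transitions).addCount("leave", 1.0);
        ctx.trainerFor(emissions).addCount("c", 10.0); // silently dropped
        ctx.trainAll();
        return new double[] { transitions.getWeight("stay"), emissions.getWeight("a") };
    }

    public static void main(String[] args) {
        double[] r = run();
        System.out.println(r[0] + " " + r[1]); // prints "0.75 0.9"
    }
}
```

The training loop treats both distributions identically; only the registered trainer differs, which is why no change to the training algorithm itself is needed.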
> My second question is:
> I implemented getWeightImpl in my custom distribution to set up my
> emission states and it works fine. But is it possible to get the
> program to call it only when a certain symbol appears in the observed
> sequence (instead of precalculating the weights)? I also found that
> (although I might be wrong) the weights are calculated twice: once when
> the distribution is created, and again when I call viterbi, which calls
> getWeightImpl again. I am not sure what I did wrong here :(
The DP code does some caching of probabilities, and I don't think
there's any way to turn this off without modifying the DP implementations.
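If the real concern is that an expensive weight calculation runs more than once, one possible workaround on the distribution side is to memoize the result per symbol. The sketch below is an assumption-laden illustration, not BioJava code: LazyWeights and its getWeight method are hypothetical names, and symbols are plain strings. It would not stop the DP code from asking for every symbol up front; it only ensures that each symbol's weight is computed at most once, however many times it is requested.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Hypothetical helper: computes a weight on first request and caches it.
class LazyWeights {
    private final ToDoubleFunction<String> expensive;
    private final Map<String, Double> cache = new HashMap<>();
    private int computations = 0;

    LazyWeights(ToDoubleFunction<String> expensive) {
        this.expensive = expensive;
    }

    double getWeight(String symbol) {
        Double cached = cache.get(symbol);
        if (cached == null) {
            computations++; // count real evaluations only
            cached = expensive.applyAsDouble(symbol);
            cache.put(symbol, cached);
        }
        return cached;
    }

    int computations() { return computations; }
}

class LazyWeightsDemo {
    // Requests weights for an observed sequence and reports how many
    // times the underlying calculation actually ran.
    static int run() {
        LazyWeights w = new LazyWeights(s -> s.equals("a") ? 0.9 : 0.1);
        String[] observed = { "a", "c", "a", "a" };
        for (String s : observed) {
            w.getWeight(s);
        }
        return w.computations();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "2"
    }
}
```

In a real custom distribution the cached lookup would sit inside getWeightImpl, and the cache would need clearing whenever the underlying parameters change.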
Thomas.