[Bioperl-l] Hidden Markov Model in Bioperl?

Hilmar Lapp hlapp at gmx.net
Sun Mar 27 18:18:01 EST 2005


Sounds like a cool thing to have in bioperl.

Just one minor comment on naming: in perl/bioperl we typically 
DontUseCapitalization to delineate words (as in Java) but use 
underscores instead. Otherwise, to my knowledge you're breaking new 
ground here, so there is no consistency check with the rest of bioperl 
to be passed, unless I'm missing something.

	-hilmar

On Friday, March 25, 2005, at 03:49  PM, Yee Man Chan wrote:

>
> Hi all
>
> 	I just wrote a C module to do Hidden Markov Model (HMM) related
> calculations. I find that there is no HMM implementation anywhere in
> Bioperl (there are parsers for HMMER output, however). I think maybe it
> would be a good idea for me to add this module to Bioperl?
>
> 	I am thinking of an interface like this:
>
> Bio::Tools::HMM->new("symbols", "states")
> - instantiate an HMM object with a string of symbols (each character
> corresponds to one symbol) and a string of states. The other parameters
> of the model are generated randomly. Good for starting a Baum-Welch
> training.
>
> Bio::Tools::HMM->new("symbols", "states", array of initial state
> probabilities, matrix of state transition probabilities, matrix of
> emission probabilities)
> - similar to the one before, but now we explicitly assign the HMM
> parameters.
>
> Bio::Tools::HMM->ObsSeqProb("string of observed sequence")
> - return the probability of an observed sequence.
>
> Bio::Tools::HMM->Viterbi("string of observed sequence")
> - return the string of hidden states that is most probable given the
> observed sequence.
>
> Bio::Tools::HMM->BaumWelchTraining(array of observed sequences)
> - uses an array of observed sequences to find the HMM parameters that
> locally maximize the probability of these observed sequences. Optional
> parameters can be passed to change the tolerance and the maximum number
> of iterations.
>
> Bio::Tools::HMM->StatisticalTraining(array of observed sequences,
> array of hidden state sequences)
> - when the hidden state sequences are also known, use them to determine
> the parameters of an HMM using a statistical method.
>
> Bio::Tools::HMM->getInitArray()
> - return the array of initial state probabilities as an @array
>
> Bio::Tools::HMM->getStateMatrix()
> - return the matrix of state transition probabilities as MatrixI
>
> Bio::Tools::HMM->getEmissionMatrix()
> - return the matrix of emission probabilities as MatrixI
>
> 	This should cover most HMM applications. What do you think? Do
> you have other functions in mind?
>
> 	I already contributed Bio::Tools::dpAlign before, so I am not a
> newbie. If someone thinks it is a good idea to have this in Bioperl, I
> can work on it as soon as possible.
>
> Best Regards,
> Yee Man
>
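
For readers unfamiliar with the two core calculations in the proposed
interface, here is a minimal sketch of what ObsSeqProb (the forward
algorithm) and Viterbi compute. It is written in Python purely for
illustration and is not the C module or the Bio::Tools::HMM API discussed
above. The function names, the argument order, and the two-state toy model
(SYMBOLS, STATES, INIT, TRANS, EMIT) are all assumptions made up for this
example. Symbols and states are strings with one character per symbol or
state, as in the proposal.

```python
# Illustrative sketch only: names and the toy model are hypothetical,
# not part of Bioperl or of the module proposed in this thread.

def obs_seq_prob(obs, symbols, states, init, trans, emit):
    """Forward algorithm: probability of obs summed over all state paths."""
    sym = {c: k for k, c in enumerate(symbols)}
    n = len(states)
    if not obs:
        return 1.0
    # alpha[i] = P(observations so far, current state = i)
    alpha = [init[i] * emit[i][sym[obs[0]]] for i in range(n)]
    for c in obs[1:]:
        k = sym[c]
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][k]
                 for j in range(n)]
    return sum(alpha)


def viterbi(obs, symbols, states, init, trans, emit):
    """Viterbi algorithm: most probable hidden state string for obs."""
    sym = {c: k for k, c in enumerate(symbols)}
    n = len(states)
    if not obs:
        return ""
    # delta[i] = probability of the best path so far that ends in state i
    delta = [init[i] * emit[i][sym[obs[0]]] for i in range(n)]
    back = []  # back[t][j] = best predecessor of state j at step t + 1
    for c in obs[1:]:
        k = sym[c]
        ptr, new = [], []
        for j in range(n):
            best = max(range(n), key=lambda i: delta[i] * trans[i][j])
            ptr.append(best)
            new.append(delta[best] * trans[best][j] * emit[j][k])
        back.append(ptr)
        delta = new
    # Trace back from the best final state.
    path = [max(range(n), key=lambda i: delta[i])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return "".join(states[i] for i in reversed(path))


# Hypothetical toy model: state X mostly emits 'a', state Y mostly emits 'b'.
SYMBOLS = "ab"
STATES = "XY"
INIT = [0.5, 0.5]                  # initial state probabilities
TRANS = [[0.8, 0.2], [0.2, 0.8]]   # TRANS[i][j] = P(next = j | current = i)
EMIT = [[0.9, 0.1], [0.1, 0.9]]    # EMIT[i][k] = P(symbol k | state i)

if __name__ == "__main__":
    print(obs_seq_prob("aabb", SYMBOLS, STATES, INIT, TRANS, EMIT))
    print(viterbi("aabb", SYMBOLS, STATES, INIT, TRANS, EMIT))
```

Multiplying raw probabilities as above underflows on long sequences, so a
real implementation would normally work in log space or rescale at each
step. The sketch skips that to keep the recurrences readable.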
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------



