[Biojava-l] HMM's - Attempting some fancy stuff
mark.schreiber at novartis.com
Fri Mar 24 02:28:04 UTC 2006
I think you could model a palindrome with a push-down automaton or something similar.
Alternatively, you could use an HMM with explicit emission durations, as
in Borodovsky's GeneMark.hmm programs, but that would require a lot of new
code in the DP library (good to have, though).
To use a Dirichlet mixture as your background, you could calculate one yourself and give its weights to a Distribution,
although it might be better to implement the Distribution interface with a
class that generates the mixture for you. To go to higher-order models you just
need a higher-order alphabet
(http://biojava.org/wiki/BioJava:Cookbook:Alphabets:CrossProduct) and
possibly an OrderNDistribution for background and emission
(http://biojava.org/wiki/BioJava:CookBook:Distribution:Custom).
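
As a rough illustration of what a higher-order background represents, here is a
minimal plain-Java sketch that estimates first-order conditional frequencies
P(next | previous) from a DNA string. It deliberately does not use the BioJava
API; the class name FirstOrderBackground, the method estimate, and the
pseudocount parameter are hypothetical names chosen for this example only.

```java
import java.util.Arrays;

/** Plain-Java illustration (not BioJava): first-order conditional nucleotide frequencies. */
final class FirstOrderBackground {
    // Hypothetical symbol ordering used to index the table: a=0, c=1, g=2, t=3.
    static final String SYMBOLS = "acgt";

    /** Returns p[prev][next], an estimate of P(next | prev) with the given pseudocount. */
    static double[][] estimate(String seq, double pseudocount) {
        double[][] p = new double[4][4];
        for (double[] row : p) {
            Arrays.fill(row, pseudocount);
        }
        String s = seq.toLowerCase();
        // Count every adjacent pair (dinucleotide) in the sequence.
        for (int i = 1; i < s.length(); i++) {
            int prev = SYMBOLS.indexOf(s.charAt(i - 1));
            int next = SYMBOLS.indexOf(s.charAt(i));
            if (prev < 0 || next < 0) {
                continue; // skip pairs containing anything other than a, c, g, t
            }
            p[prev][next] += 1.0;
        }
        // Normalise each row so that it sums to 1 (rows with no mass are left at 0).
        for (double[] row : p) {
            double total = 0.0;
            for (double c : row) {
                total += c;
            }
            if (total > 0.0) {
                for (int j = 0; j < row.length; j++) {
                    row[j] /= total;
                }
            }
        }
        return p;
    }

    public static void main(String[] args) {
        double[][] p = estimate("acgtacgtaacc", 1.0);
        for (int i = 0; i < 4; i++) {
            System.out.println(SYMBOLS.charAt(i) + " -> " + Arrays.toString(p[i]));
        }
    }
}
```

Each row of the table plays the role of one conditional distribution; in BioJava
terms, the conditioning symbol and the emitted symbol together would come from
the cross-product alphabet.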
- Mark
Todd Riley <toddri at eden.rutgers.edu>
Sent by: biojava-l-bounces at lists.open-bio.org
03/24/2006 07:04 AM
To: Francois Pepin <fpepin at aei.ca>
cc: biojava-l at biojava.org, Mark Schreiber/GP/Novartis at PH
Subject: Re: [Biojava-l] HMM's - Attempting some fancy stuff
Yes, I agree that the palindromes are not always identical. However,
often my unaligned training data is not complete enough to train the
model well without some simplification. So far, I have been using
cross-validation, sensitivity, and specificity to determine the
effectiveness of this simplification approach.
-Todd
Francois Pepin wrote:
>>1. Many of the TFBS sites that I am modeling are palindromic or
>>repetitive. I wish to associate transition and emission distributions
>>(as prior knowledge) during training in order to enforce a palindromic
>>and/or repetitive pattern and thus also greatly reduce the parameter
>>space.
>>
>>
>
>Just as a note, we haven't found this to be ideal, if you have
>sufficient training data. It is often the case that one of the
>palindromes is more conserved than the other, and you would be treating
>them the same way.
>
>Of course, it depends how much of an in-depth study you'll want to be
>doing.
>
>Francois
>
>
>
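
For readers following the thread, the parameter tying discussed above can be
sketched in plain Java. This is only one possible way to do it, under the
assumption that each site position has a table of emission counts over a, c, g,
t: the counts at position i are pooled with the reverse-complemented counts at
the mirror position, which roughly halves the number of free parameters. It
does not use the BioJava API, and the names PalindromeTying and tie are
hypothetical.

```java
import java.util.Arrays;

/** Plain-Java illustration (not BioJava): tie emissions of a reverse-complement palindrome. */
final class PalindromeTying {
    /**
     * counts[i][j] is the count of symbol j at site position i, with the
     * assumed ordering a=0, c=1, g=2, t=3, so the complement of j is 3 - j.
     * Returns tied probabilities where position i and its mirror position
     * share one pooled estimate (reverse-complemented).
     */
    static double[][] tie(double[][] counts) {
        int len = counts.length;
        double[][] probs = new double[len][4];
        for (int i = 0; i < len; i++) {
            int mirror = len - 1 - i;
            double total = 0.0;
            for (int j = 0; j < 4; j++) {
                // Pool this position with the complemented symbol at the mirror position.
                probs[i][j] = counts[i][j] + counts[mirror][3 - j];
                total += probs[i][j];
            }
            if (total > 0.0) {
                for (int j = 0; j < 4; j++) {
                    probs[i][j] /= total;
                }
            }
        }
        return probs;
    }

    public static void main(String[] args) {
        double[][] counts = {
            {8, 0, 0, 2},   // position 0: mostly a
            {0, 0, 0, 10},  // position 1: always t
        };
        for (double[] row : tie(counts)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

As Francois points out, pooling like this treats both half-sites as equally
conserved, so it is mainly useful when the training data is too sparse to
estimate the two halves separately.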