[Biojava-dev] Comments about OrderNDistributions
matthew_pocock at yahoo.co.uk
Tue Mar 4 17:23:11 EST 2003
You can have a distribution over codons of the form
P(ctg) by just using an ordinary (non-conditional)
probability distribution over the cross-product alphabet
DNA x DNA x DNA. Use the standard distribution
factory and it will just work.
If you use this distribution in an HMM, you must make
sure that you always look at non-overlapping codons,
or the probabilities won't add up and your model won't
be valid (and may not even train).
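
As a rough illustration of the non-overlapping requirement, here is a
minimal sketch in plain Java. It does not use the BioJava classes: the
class name CodonJointSketch and the method jointFromCodons are made up
for this example, and the joint distribution is just a map from codon
string to weight. It steps through the sequence three bases at a time,
so each base is counted in exactly one codon and the weights sum to 1.

```java
import java.util.Map;
import java.util.TreeMap;

public class CodonJointSketch {
    // Estimate joint codon weights P(xyz) from non-overlapping triplets.
    // Hypothetical helper, not part of BioJava.
    static Map<String, Double> jointFromCodons(String dna) {
        Map<String, Double> weights = new TreeMap<>();
        int total = 0;
        // Step by 3 so that every base belongs to exactly one codon.
        for (int i = 0; i + 3 <= dna.length(); i += 3) {
            weights.merge(dna.substring(i, i + 3), 1.0, Double::sum);
            total++;
        }
        if (total == 0) {
            throw new IllegalArgumentException("need at least one full codon");
        }
        final double n = total;
        // Turn raw counts into relative frequencies.
        weights.replaceAll((codon, count) -> count / n);
        return weights;
    }

    public static void main(String[] args) {
        Map<String, Double> p = jointFromCodons("ctgctgaaactg");
        double sum = 0.0;
        for (double w : p.values()) {
            sum += w;
        }
        System.out.println(p);
        System.out.println("sum of weights = " + sum);
    }
}
```

A trailing partial codon (one or two leftover bases) is simply ignored
in this sketch.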
--- Francois Pepin <fpepin at cs.mcgill.ca> wrote:
> After going through the code for the
> OrderNDistributions, there are a
> couple of comments and questions that I would have:
> Is there any reason why the conditional
> probabilities instead of joint
> probabilities are used there?
> Right now, OrderNDistribution.getWeight(cgt) (or
> any codon) gives
> P(t|cg) while getting P(cgt) would be a lot more
> useful. It's quite easy
> to go from the joint to the conditional
> probabilities while getting the
> opposite information is pretty troublesome.
> To get P(cgt), one would need P(t|cg)*P(cg), where
> P(cg) is the sum over n of P(g|nc)*P(nc), and each
> P(nc) in turn needs a sum over P(c|nn)*P(nn).
> I don't really see why it isn't stored as joint
> probabilities, so that one does not have
> to worry about the conditioning and conditioned
> alphabets there.
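
To make the "easy" direction concrete, here is a minimal sketch in
plain Java of going from joint codon weights to a conditional weight,
using P(t|cg) = P(cgt) / sum over x of P(cgx). The joint distribution
is assumed to be a plain map from three-letter strings to weights; the
class and method names are invented for this example and are not the
BioJava Distribution API. Going the other way needs the context
probability P(cg) as well, which the conditional weights alone do not
give directly.

```java
import java.util.HashMap;
import java.util.Map;

public class JointToConditionalSketch {
    // P(last | first two) = P(codon) / total weight of codons sharing the prefix.
    // Hypothetical helper; expects three-letter keys in the joint map.
    static double conditional(Map<String, Double> joint, String codon) {
        String prefix = codon.substring(0, 2);
        double prefixMass = 0.0;
        for (Map.Entry<String, Double> e : joint.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                prefixMass += e.getValue();
            }
        }
        if (prefixMass == 0.0) {
            throw new IllegalArgumentException("context never observed: " + prefix);
        }
        return joint.getOrDefault(codon, 0.0) / prefixMass;
    }

    public static void main(String[] args) {
        // Toy joint weights, made up for illustration only.
        Map<String, Double> joint = new HashMap<>();
        joint.put("cgt", 0.25);
        joint.put("cga", 0.25);
        joint.put("aaa", 0.5);
        System.out.println("P(t|cg) = " + conditional(joint, "cgt"));
    }
}
```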
> Also, I'm not positive about this, but it seems that
> some information
> would be lost (or at least quite difficult to
> recover) about the first
> few characters of the distribution. For example, with
> AACCCGGG, there
> are no A's that would appear anywhere in a 3rd order
> (which would really
> be a 2nd order Markov chain) distribution. Two ways
> of getting around it
> would be to carry all of the distributions of lower
> order to make sure
> that you have the data around, but it's a bit
> cumbersome, or to have the
> SymbolListViews.orderNSymbolList(AACCCGGG, 3) give
> out NNAACCCGGG in
> this case, and have the orderNDistributions take
> that into account.
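
The padding idea can be sketched in plain Java as follows. This is
only an illustration of overlapping order-3 windows with and without
leading N's; it is not the behaviour of
SymbolListViews.orderNSymbolList, and the class and method names here
are hypothetical. The last letter of each window is the symbol being
conditioned, so printing those letters shows which symbols the
distribution actually gets to see.

```java
import java.util.ArrayList;
import java.util.List;

public class OrderNWindowsSketch {
    // Overlapping windows of length n over seq. With padStart set,
    // n-1 'N' symbols are prepended first (hypothetical padding scheme).
    static List<String> windows(String seq, int n, boolean padStart) {
        StringBuilder sb = new StringBuilder();
        if (padStart) {
            for (int i = 0; i < n - 1; i++) {
                sb.append('N');
            }
        }
        sb.append(seq);
        String s = sb.toString();
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= s.length(); i++) {
            out.add(s.substring(i, i + n));
        }
        return out;
    }

    // Collect the last symbol of every window, i.e. the conditioned symbols.
    static String lastSymbols(List<String> ws) {
        StringBuilder sb = new StringBuilder();
        for (String w : ws) {
            sb.append(w.charAt(w.length() - 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> plain = windows("AACCCGGG", 3, false);
        List<String> padded = windows("AACCCGGG", 3, true);
        System.out.println(plain + " -> " + lastSymbols(plain));
        System.out.println(padded + " -> " + lastSymbols(padded));
    }
}
```

Without padding the conditioned symbols are CCCGGG, so the two leading
A's never show up; with padding they are AACCCGGG, at the cost of
having to handle N in the conditioning context.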
> What do people think about this?
> Francois Pepin