[Biopython-dev] Questions about Codon Alignment Proposal

Eric Talevich eric.talevich at gmail.com
Sat Apr 27 18:25:33 EDT 2013


On Sat, Apr 27, 2013 at 4:11 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Sat, Apr 27, 2013 at 6:23 PM, 阮铮 <rz1991 at foxmail.com> wrote:
> > Hi Eric and Peter,
> >
> > I'm preparing the proposal for the codon alignment project. There are two
> > things I would like your advice on.
> >
> > 1) In the biopython wiki page, you mentioned "model selection" in the
> > Approach & Goals. I'm not sure whether there are any advantages to using
> > codon alignments for model selection. Could you give me some references?
> > Another thing is that model selection involves estimating the tree
> > topology as well as branch lengths and parameters across many
> > substitution models. Will it be too computationally intensive for a
> > Python implementation?
>
> I'm not sure what Eric had in mind, but one option is to wrap an
> existing specialised tool specifically for model selection. One
> example I can think of is the graphical tool Topali2 (written by
> some of my current work colleagues) which I believe initially
> called Modelgenerator to do this, but now calls PhyML:
>
> http://www.topali.org
> http://bioinf.nuim.ie/modelgenerator/
>

Actually, I added "model selection" to the project description because
Peter mentioned it earlier. :)

I didn't have any particular function in mind, but calling out to external
tools sounds like a reasonable approach. We do have some primitive
functionality for likelihood ratio testing through the PAML module, so
maybe that could come into play here.
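
As a rough illustration of the likelihood ratio testing idea, here is a
minimal sketch in plain Python. It does not use the PAML module's API; the
function names (chi2_sf, likelihood_ratio_test) are hypothetical, and the
log-likelihood values are assumed to come from fitting two nested models
with an external tool.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square distribution (integer df).

    Uses the closed-form series for integer degrees of freedom, so no
    third-party libraries are needed.
    """
    if df < 1 or stat < 0:
        raise ValueError("need df >= 1 and stat >= 0")
    half = stat / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term = 1.0
        total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite sum over half-integer powers.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5) * math.exp(-half)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (i + 0.5)
    return total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Compare two nested models given their log-likelihoods.

    df is the number of extra free parameters in the alternative model.
    Returns the test statistic and its chi-square p-value.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)
```

For example, likelihood_ratio_test(-1000.0, -995.0, 1) gives a statistic of
10 and a p-value of roughly 0.0016, which would favour the richer model.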

> > 2) You also mentioned the "validation (testing for frame shift)". Is
> > there a test for frame shifts? Or can I simply detect them by comparing
> > the amino acid sequences and nucleotide sequences?
> >
> > Best,
> > Zheng Ruan
>
> Again, you'd have to ask Eric exactly what he had in mind.
>

Yes, it's just a matter of testing for unexpected cases in which the
protein and nucleotide sequences don't quite match, and handling them
appropriately. It would be nice to see a description of these possible edge
cases and your treatment of them in the detailed weekly schedule for the
project.
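
To make the kind of check I mean concrete, here is a minimal sketch in
plain Python. It is only an illustration under stated assumptions: the
function name find_codon_mismatches is hypothetical, it assumes the
standard genetic code, and it only reports problems rather than deciding
how to handle them.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def find_codon_mismatches(protein, nucleotide):
    """Return a list of messages describing where the two sequences disagree.

    An empty list means every codon translates to the matching residue.
    """
    problems = []
    protein = protein.upper()
    nucleotide = nucleotide.upper().replace("U", "T")

    # A length that is not a multiple of 3 hints at a frame shift.
    remainder = len(nucleotide) % 3
    if remainder:
        problems.append(
            "nucleotide length %d is not a multiple of 3" % len(nucleotide)
        )

    usable = len(nucleotide) - remainder
    codons = [nucleotide[i:i + 3] for i in range(0, usable, 3)]

    # Tolerate a single trailing stop codon that the protein does not show.
    if (len(codons) == len(protein) + 1
            and CODON_TABLE.get(codons[-1]) == "*"):
        codons.pop()

    if len(codons) != len(protein):
        problems.append(
            "expected %d codons, found %d" % (len(protein), len(codons))
        )

    # Compare residue by residue; unknown codons translate to "X".
    for index, (aa, codon) in enumerate(zip(protein, codons)):
        translated = CODON_TABLE.get(codon, "X")
        if translated != aa:
            problems.append(
                "position %d: codon %s translates to %s, expected %s"
                % (index + 1, codon, translated, aa)
            )
    return problems
```

Calling find_codon_mismatches("MKV", "ATGAAAGTT") returns an empty list,
while a nucleotide sequence with a base deleted produces length and codon
count messages. Cases such as ambiguous bases, gaps, or alternative
genetic codes are not covered here and would need their own treatment.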



> Both these questions would probably be ideal to ask on the
> NESCent Google Group - there should be some phylogenetic
> experts able to give a much more detailed answer than I can :)
> https://plus.google.com/communities/105828320619238393015
>
>
