[Biopython-dev] Questions about Codon Alignment Proposal

阮铮 rz1991 at foxmail.com
Sun Apr 28 18:59:33 UTC 2013

Thanks Peter and Eric,

I just finished my draft proposal for the Codon Alignment Project. It can be found at

I didn't include model selection in my timeline, as it seems a little beyond the scope of codon alignment.
I look forward to hearing your comments and suggestions. Thanks again.

Zheng Ruan

------------------ Original ------------------
From:  "Eric Talevich"<eric.talevich at gmail.com>;
Date:  Apr 28, 2013
To:  "阮铮"<rz1991 at foxmail.com>; "Peter Cock"<p.j.a.cock at googlemail.com>; 
Cc:  "Biopython-Dev Mailing List"<biopython-dev at biopython.org>; 
Subject:  Re: Questions about Codon Alignment Proposal

On Sat, Apr 27, 2013 at 4:11 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
 On Sat, Apr 27, 2013 at 6:23 PM, 阮铮 <rz1991 at foxmail.com> wrote:
 > Hi Eric and Peter,
 > I'm preparing the proposal for the codon alignment project. There are two things
 > I would like your advice on.
 > 1) On the Biopython wiki page, you mentioned "model selection" in the
 > Approach & Goals. I'm not sure whether there are any advantages to using codon
 > alignments for model selection. Could you give me some references? Another
 > concern is that model selection involves estimating the tree topology as well
 > as branch lengths and parameters across many substitution models. Will it
 > be too computationally intensive for a Python implementation?
I'm not sure what Eric had in mind, but one option is to wrap an
 existing specialised tool specifically for model selection. One
 example I can think of is the graphical tool Topali2 (written by
 some of my current work colleagues) which I believe initially
 called Modelgenerator to do this, but now calls PhyML:

Actually, I added "model selection" to the project description because Peter mentioned it earlier. :)

I didn't have any particular function in mind, but calling out to external tools sounds like a reasonable approach. We do have some primitive functionality for likelihood ratio testing through the PAML module, so maybe that could come into play here.
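To make the likelihood ratio testing idea concrete, here is a minimal sketch in plain Python. It does not use the PAML module's API. The function name `likelihood_ratio_test` and the log-likelihood values are hypothetical, and it assumes the two models are nested and that the log-likelihoods come from an external tool. Only 1 and 2 degrees of freedom are handled, because those have simple closed forms.

```python
import math


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (statistic, p_value) for two nested models.

    lnl_null and lnl_alt are log-likelihoods (hypothetical inputs, e.g.
    as reported by an external program); df is the difference in the
    number of free parameters between the two models.
    """
    # The test statistic is twice the log-likelihood difference.
    # Clamp at zero in case of small numerical noise.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if df == 1:
        # Chi-square survival function with 1 degree of freedom.
        p_value = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        # Chi-square survival function with 2 degrees of freedom.
        p_value = math.exp(-stat / 2.0)
    else:
        raise ValueError("this sketch only handles df of 1 or 2")
    return stat, p_value


# Hypothetical log-likelihoods for a simpler and a richer model.
stat, p_value = likelihood_ratio_test(-1000.0, -995.0, df=1)
print("2*delta(lnL) = %.2f, p = %.4g" % (stat, p_value))
```

A small p-value would favour the richer model. A real implementation would more likely use a statistics library for the chi-square distribution.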

 > 2) You also mentioned "validation (testing for frame shift)". Is there a
 > test for frame shifts? Or can I simply detect them by comparing the amino acid
 > and nucleotide sequences?
 > Best,
 > Zheng Ruan
Again, you'd have to ask Eric exactly what he had in mind.

Yes, it's just a matter of testing for unexpected cases in which the protein and nucleotide sequences don't quite match, and handling them appropriately. It would be nice to see a description of these possible edge cases and your treatment of them in the detailed weekly schedule for the project.
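As a rough illustration of that kind of check, here is a minimal sketch in plain Python. It is not an existing Biopython function. The name `check_codon_match` is hypothetical, and it assumes the standard genetic code, ungapped input sequences, and an optional trailing stop codon on the nucleotide sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def check_codon_match(protein, nucleotide):
    """Return a list of problems found when comparing the two sequences.

    An empty list means every codon translates to the matching residue.
    """
    problems = []
    pro = protein.upper()
    nuc = nucleotide.upper().replace("U", "T")

    # A length that is not a multiple of three hints at a frame shift.
    if len(nuc) % 3 != 0:
        problems.append("nucleotide length %d is not a multiple of 3" % len(nuc))

    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(nuc) - len(nuc) % 3
    codons = [nuc[i:i + 3] for i in range(0, usable, 3)]

    # Tolerate one trailing stop codon that the protein does not include.
    if len(codons) == len(pro) + 1 and CODON_TABLE.get(codons[-1]) == "*":
        codons.pop()

    if len(codons) != len(pro):
        problems.append(
            "%d codons but %d amino acids" % (len(codons), len(pro))
        )

    # Compare residue by residue; unknown codons are reported as 'X'.
    for index, (aa, codon) in enumerate(zip(pro, codons)):
        translated = CODON_TABLE.get(codon, "X")
        if translated != aa:
            problems.append(
                "position %d: codon %s encodes %s, protein has %s"
                % (index, codon, translated, aa)
            )
    return problems


print(check_codon_match("MKV", "ATGAAAGTT"))
print(check_codon_match("MKV", "ATGAAGTT"))
```

The second call drops one base from the nucleotide sequence, so the check reports problems instead of an empty list. How to recover from such a case (for example, by locating the shifted codon) is a separate design decision.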

 Both these questions would probably be ideal to ask on the
 NESCent Google Group - there should be some phylogenetic
 experts able to give a much more detailed answer than I can :)
