[Biopython] Back translation support in Biopython

Peter Cock p.j.a.cock at googlemail.com
Wed Apr 4 15:02:41 UTC 2012


On Wed, Apr 4, 2012 at 2:49 AM, Eric Talevich <eric.talevich at gmail.com> wrote:
> Hi Igor,
>
> It sounds like you're referring to aligning amino acid sequences to codon
> sequences, as PAL2NAL does. This is different from what most people mean by
> back translation, but as you point out, certainly useful.
>
> If you write a function that can match a protein sequence alignment to a set
> of raw CDS sequences, returning a nucleotide alignment based on the
> codon-to-amino-acid mapping, that would be useful. However, PAL2NAL does
> exactly that, plus a bit more, and is a fairly well-known and easily
> obtained program. Personally, I would prefer to write a wrapper for PAL2NAL
> under Bio.Align.Applications, using the existing Bio.Applications framework.

As per the old thread, a simple function in Python taking the gapped protein
sequence, original nucleotide coding sequence, and the translation table
does sound useful. Then using that, you could go from a protein alignment
plus the original nucleotide coding sequences to a codon alignment, or
other tasks. Given this is all relatively straightforward string manipulation
and we already have the required genetic code tables in Biopython, I'm not
convinced that wrapping PAL2NAL would be the best solution (for this
sub-task).
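
For illustration, a minimal sketch of such a function might look like
the following. The name thread_codons, its parameters, and the toy
codon table are hypothetical and are not existing Biopython API; in
practice the codon-to-amino-acid mapping would come from the genetic
code tables Biopython already ships. The sketch assumes the nucleotide
sequence is in frame, starts at the first codon, and may end with one
extra (stop) codon.

```python
def thread_codons(gapped_protein, nucleotide_cds, codon_table=None, gap="-"):
    """Thread an ungapped coding sequence onto a gapped protein sequence.

    Each residue consumes one codon from nucleotide_cds, and each gap
    character in the protein becomes three gap characters.

    codon_table is an optional dict mapping codon -> amino acid
    (hypothetical parameter); if given, each codon is checked against
    the residue it is paired with.
    """
    pieces = []
    pos = 0  # index into the ungapped nucleotide sequence
    for residue in gapped_protein:
        if residue == gap:
            pieces.append(gap * 3)
            continue
        codon = nucleotide_cds[pos:pos + 3]
        if len(codon) < 3:
            raise ValueError("Nucleotide sequence is too short for the protein")
        if codon_table is not None:
            expected = codon_table.get(codon.upper())
            if expected != residue.upper():
                raise ValueError(
                    "Codon %r at position %d does not code for %r"
                    % (codon, pos, residue)
                )
        pieces.append(codon)
        pos += 3
    # Tolerate exactly one leftover codon, assumed to be the stop codon.
    if len(nucleotide_cds) - pos not in (0, 3):
        raise ValueError("Nucleotide sequence is too long for the protein")
    return "".join(pieces)


# Toy table covering only the codons used in this example (an assumption,
# not a real genetic code table).
TOY_TABLE = {"ATG": "M", "AAA": "K", "GGT": "G"}

print(thread_codons("M-KG", "ATGAAAGGT", TOY_TABLE))  # ATG---AAAGGT
```

Applying this to each row of a protein alignment, with the matching
coding sequence for each record, would give the codon alignment.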

> Once the user has a codon alignment, dn/ds and many other calculations based
> on evolutionary models can be performed with our PAML wrappers, under
> Bio.Phylo.PAML. I agree there is room in Biopython to make this workflow
> easier to perform. (Although I wouldn't be able to mentor such a project
> under GSoC this year.)

Doing some of the calculations directly within Biopython could be
interesting and useful - although calling PAML is a very pragmatic
solution too.

I'm not sure you have enough work here to justify a GSoC project,
and the timing is also rather tight to find a suitable mentor. Maybe next
year? However, you can still start contributing to Biopython now - and
such involvement would be viewed positively on a future GSoC
application (not just with us: for other participating projects too,
being able to show past contributions to open source projects is good).

Regards,

Peter


