[Biopython] Back translation from Protein to RNA sequence

Peter Cock p.j.a.cock at googlemail.com
Mon Apr 7 08:58:26 EDT 2014


On Wed, Apr 2, 2014 at 5:33 PM, Ivan Gregoretti <ivangreg at gmail.com> wrote:
> The documentation of the Seq object nicely shows how to
>
> 1) transcribe DNA -> RNA,
> 2) back transcribe RNA -> DNA, and
> 3) translate RNA -> protein.
>
> If priorities allow, I would appreciate the expansion of the documentation
> with one example of
>
> 4) back translation protein -> most_probable_RNA.
>
> The result of that operation is species-dependent and worth documenting if
> the functionality already exists.
>
> Thank you,
>
> Ivan

Hello Ivan,

Biopython deliberately does not currently have any
back-translation functionality.

Why do you want this, and how would you define it?

I think 'most probable' would require a codon usage table
for the organism, and would need a tie breaker for when
two codons are equally frequent - or would you be happy
with non-deterministic output?

There is a whole set of details which would need to
be settled, such as what to do with ambiguous
amino acids (e.g. X or J), making a general-purpose
back-translation function rather complex.
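
To make those questions concrete, here is a minimal Python sketch. It is
not Biopython code, and the names CODON_USAGE, most_frequent_codon and
back_translate are hypothetical. The usage table is a made-up fragment
covering only a few amino acids, with illustrative numbers rather than
measured frequencies for any organism. It assumes ties are broken
alphabetically so the output is deterministic, and that any amino acid
missing from the table (including ambiguous ones like X or J) is an error.

```python
# Hypothetical codon usage fractions; illustrative numbers only,
# not measured values for any real organism.
CODON_USAGE = {
    "M": {"AUG": 1.0},
    "K": {"AAA": 0.5, "AAG": 0.5},  # deliberate tie
    "F": {"UUU": 0.45, "UUC": 0.55},
    "*": {"UAA": 0.5, "UAG": 0.2, "UGA": 0.3},
}


def most_frequent_codon(codons):
    """Pick the most frequent codon, breaking ties alphabetically."""
    # Sort key: highest frequency first, then codon string as tie breaker.
    return min(codons, key=lambda codon: (-codons[codon], codon))


def back_translate(protein, usage=CODON_USAGE):
    """Return one 'most probable' RNA string for a protein string."""
    rna = []
    for amino_acid in protein:
        if amino_acid not in usage:
            # Ambiguous or unknown letters are rejected rather than guessed.
            raise ValueError("No codon usage data for %r" % amino_acid)
        rna.append(most_frequent_codon(usage[amino_acid]))
    return "".join(rna)


print(back_translate("MKF*"))
```

Other choices are equally defensible (random tie breaking, or emitting
ambiguous nucleotides for X), which is part of why there is no single
obvious definition.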

Last time this was discussed on the mailing list, the real
use case was back-translation as used with protein to
nucleotide alignment, where the nucleotide sequence is
already known and just the gaps need inserting appropriately, e.g.
https://github.com/peterjc/pico_galaxy/tree/master/tools/align_back_trans
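
As a rough illustration of that gap-threading idea (a sketch only, not
the implementation of the tool linked above), the function below takes
one gapped protein string from an alignment plus its known, ungapped
nucleotide sequence, and inserts three gap characters for every protein
gap. The name thread_gaps is hypothetical, and it assumes the nucleotide
sequence is exactly three times the ungapped protein length, with no
trailing stop codon.

```python
def thread_gaps(aligned_protein, nucleotides, gap="-"):
    """Insert codon-sized gaps into a known nucleotide sequence."""
    ungapped_length = len(aligned_protein) - aligned_protein.count(gap)
    if len(nucleotides) != 3 * ungapped_length:
        raise ValueError(
            "Expected %i nucleotides, got %i"
            % (3 * ungapped_length, len(nucleotides))
        )
    pieces = []
    position = 0  # index into the ungapped nucleotide sequence
    for letter in aligned_protein:
        if letter == gap:
            # One protein gap becomes three nucleotide gap characters.
            pieces.append(gap * 3)
        else:
            # Copy the codon that actually encodes this residue.
            pieces.append(nucleotides[position:position + 3])
            position += 3
    return "".join(pieces)


print(thread_gaps("M-KF", "ATGAAATTT"))
```

Because the codons come from the real nucleotide sequence, no codon
usage table or tie breaker is needed in this use case.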

Regards,

Peter

