[BioPython] Re: back-translation method for Seq object?

Peter biopython at maubp.freeserve.co.uk
Tue Oct 21 14:26:49 UTC 2008


Bruce wrote:
> Leighton wrote:
>> Each codon -> one amino acid : one-one mapping
>> Arg -> set of 6 possible codons : one-many mapping

I agree with Leighton.

> If you believed this then your answer below is incorrect.

No, I think you are just not using the terms one-to-one and
one-to-many as a mathematician would.
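
To make the terminology concrete, here is a rough sketch in plain Python
(no Biopython needed). The names FORWARD, BACK and back_table are made up
for this illustration and are not part of any library; the table is the
standard genetic code written in the usual TCAG order. Translation is a
function (each codon gives exactly one amino acid), while the reverse
relation sends most amino acids to several codons.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at every codon position.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Forward direction: every codon maps to exactly one amino acid (or stop).
FORWARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def back_table(forward):
    """Invert a codon -> amino acid dict into amino acid -> set of codons."""
    back = {}
    for codon, aa in forward.items():
        back.setdefault(aa, set()).add(codon)
    return back


# Reverse direction: one amino acid, possibly many codons.
BACK = back_table(FORWARD)

print(FORWARD["CGT"])     # R
print(sorted(BACK["R"]))  # ['AGA', 'AGG', 'CGA', 'CGC', 'CGG', 'CGT']
print(len(BACK["M"]))     # 1
```

So codon -> amino acid is many-to-one, and amino acid -> codons is
one-to-many for everything except Met and Trp.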

> The genetic code
> allows one amino acid to map to three nucleotides, but not any three, nor
> any more or fewer than three. So to be clear, there is a one-to-one
> mapping between a codon and an amino acid, as well as between an amino
> acid and a codon.
> Therefore it is impossible for Arg to map to six possible codons, as only one
> is correct. Under the standard genetic code, each amino acid can be
> represented in a regular expression either as the bases or as ambiguous
> nucleotide codes:
> Ala/A =(GCT|GCC|GCA|GCG) = GCN

That is a one to four mapping using unambiguous nucleotides, or a one
to one mapping using ambiguous nucleotides.  This is a nice case.
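
One way to check whether an amino acid is a "nice case" is to collapse
its codons position by position into IUPAC ambiguity letters and see
whether the result expands back to exactly the same set. The sketch below
assumes the standard IUPAC nucleotide codes; collapse, expand and is_exact
are hypothetical helper names, not Biopython functions.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Reverse lookup: set of bases -> single ambiguity letter.
LETTER_FOR = {frozenset(bases): letter for letter, bases in IUPAC.items()}


def collapse(codons):
    """Merge a set of codons into one ambiguous codon, position by position."""
    return "".join(
        LETTER_FOR[frozenset(codon[i] for codon in codons)] for i in range(3)
    )


def expand(ambiguous):
    """List every unambiguous codon that an ambiguous codon stands for."""
    return {"".join(p) for p in product(*(IUPAC[ch] for ch in ambiguous))}


def is_exact(codons):
    """True if a single ambiguous codon covers these codons and no others."""
    return expand(collapse(codons)) == set(codons)


ALA = {"GCT", "GCC", "GCA", "GCG"}
ARG = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}

print(collapse(ALA), is_exact(ALA))  # GCN True
print(collapse(ARG), is_exact(ARG))  # MGN False
print(sorted(expand("MGN") - ARG))   # ['AGC', 'AGT']
```

Ala collapses cleanly to GCN. Arg collapses to MGN, which also covers AGT
and AGC (both Ser), so a single ambiguous codon is not enough there.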

> Leu/L =(TTA|TTG|CTT|CTC|CTA|CTG) = (TTR|CTN)

That is a one to six mapping using unambiguous nucleotides, or a one
to two mapping using ambiguous nucleotides.  This is a problem case.

> This is still a one-to-one mapping between an amino acid and the regular
> expression for the triplets that encode it. Unfortunately the
> ambiguous nucleotide codes cannot be used directly in a regular expression
> search.

The problem is that (TTR|CTN) or similar doesn't work in Seq objects;
we would need a more advanced representation (perhaps based on regular
expressions).
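
For what it's worth, turning such a pattern into a real regular
expression is straightforward outside of a Seq object. The following is
only a sketch using the standard library re module; ambiguous_to_regex and
in_frame_hits are hypothetical names for this example, and nothing like
this exists as a Seq method as far as this thread is concerned. It uses
Arg, written as (CGN|AGR), as the example.

```python
import re

# IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def ambiguous_to_regex(pattern):
    """Rewrite IUPAC letters as character classes; keep '(', '|' and ')'."""
    parts = []
    for ch in pattern:
        if ch in IUPAC:
            bases = IUPAC[ch]
            parts.append(bases if len(bases) == 1 else "[" + bases + "]")
        elif ch in "(|)":
            parts.append(ch)
        else:
            raise ValueError("unexpected character in pattern: %r" % ch)
    return "".join(parts)


def in_frame_hits(dna, pattern):
    """Return start offsets of in-frame codons that match the pattern."""
    regex = re.compile(ambiguous_to_regex(pattern))
    return [
        i for i in range(0, len(dna) - 2, 3) if regex.fullmatch(dna[i:i + 3])
    ]


print(ambiguous_to_regex("(CGN|AGR)"))             # (CG[ACGT]|AG[AG])
print(in_frame_hits("ATGCGTAGAAGT", "(CGN|AGR)"))  # [3, 6]
```

Here CGT and AGA are found, while the final AGT (Ser) is correctly
skipped. This works on plain strings; it does not address how such a
pattern would be stored in or returned by a Seq object.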

Peter


