[EMBOSS] transeq and ambiguous codons

Peter biopython at maubp.freeserve.co.uk
Fri Jul 10 23:10:19 UTC 2009


On Fri, Jul 10, 2009 at 10:30 AM, Peter Rice <pmr at ebi.ac.uk> wrote:
>
> Peter C. wrote:
>>
>> OK, leaving TRR aside for the moment (I'm not sure I'd have done it that
>> way, but I think I follow your logic), I have some more problem cases for
>> you to consider (all using the default standard NCBI table 1).
>>
>> Most of these are 'unambiguous ambiguous codons' as you put it, and
>> I would agree using X when a more specific letter is possible isn't ideal
>> but isn't actually wrong. The "ATS" and related codons (see below)
>> however are simply wrong.
>
> They do look wrong. The "X when it could pick a residue" ones I knew of.
>
> The others need a closer look. The plan is to work through all possible
> codons and all the NCBI genetic codes as soon as the release is out.
>
> It should be a simple patch to ajtranslate.c when I'm done.
>

OK - I appreciate this is too last minute for the imminent EMBOSS release.

>> --------------------------------------------------------------------------------------
>>
>> Now for another debatable one, RAT means AAT or GAT which code
>> for N and D. So, you could use B (Asx) here rather than the broader X.
>>
>> Similarly, you don't use J to mean leucine (L) or isoleucine (I), and
>> opt for X (again, this is justifiable), e.g. WTA.
>
> Hmmm ... B and Z are ambiguity codes for amino acid analysers, where all the
> amide bonds are broken and that includes N->D and Q->E. We used to have one
> of those in the lab. Similarly, J is for mass spec, where I and L have the
> same molecular weight. I don't consider them appropriate for translation.

Well, as I said, this is debatable. On the one hand, B and Z are IUPAC standards
(although J isn't yet). On the other hand, amino acids don't have the full
ambiguous alphabet that we have for nucleotides, so some might find such a
translation surprising.
http://www.chem.qmul.ac.uk/iupac/AminoAcid/A2021.html

> So I plan to go for unique amino acids where possible with the ambiguity
> codes.

Good :)
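
To make the rule concrete, here is a minimal Python sketch of "unique amino
acid where possible, otherwise X". It is not the EMBOSS code and not a patch
for ajtranslate.c; the names TABLE, IUPAC and translate_codon are made up for
this illustration, and it only covers the standard NCBI table 1. The use_bzj
switch is a hypothetical option showing the debatable B/Z/J behaviour.

```python
from itertools import product

# Standard genetic code (NCBI table 1), in the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def translate_codon(codon, use_bzj=False):
    """Translate one (possibly ambiguous) DNA codon.

    Expands the codon into every unambiguous codon it could represent.
    If they all give the same residue, return it; otherwise return X.
    With use_bzj=True (an assumption for illustration), the pairs N/D,
    Q/E and I/L are reported as B, Z and J instead of X.
    Letters outside the IUPAC set (e.g. gaps) raise KeyError.
    """
    choices = (IUPAC[base] for base in codon.upper())
    residues = {TABLE["".join(c)] for c in product(*choices)}
    if len(residues) == 1:
        return residues.pop()
    if use_bzj:
        if residues == {"N", "D"}:
            return "B"
        if residues == {"Q", "E"}:
            return "Z"
        if residues == {"I", "L"}:
            return "J"
    return "X"

if __name__ == "__main__":
    for codon in ("YTA", "ATS", "RAT", "WTA", "TRR"):
        print(codon, translate_codon(codon), translate_codon(codon, use_bzj=True))
```

With this sketch, YTA (CTA or TTA) gives L, ATS (ATC or ATG, i.e. I or M)
gives X, and RAT and WTA give X by default or B and J with use_bzj=True.
TRR expands to TAA, TAG, TGA and TGG (stop or W), so it comes out as X here.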

Peter C.


