[EMBOSS] transeq and ambiguous codons
Peter Rice
pmr at ebi.ac.uk
Fri Jul 10 09:30:52 UTC 2009
Peter C. wrote:
> OK, leaving TRR aside for the moment (I'm not sure I'd have done it that
> way, but I think I follow your logic), I have some more problem cases for
> you to consider (all using the default standard NCBI table 1).
>
> Most of these are 'unambiguous ambiguous codons' as you put it, and
> I would agree using X when a more specific letter is possible isn't ideal
> but isn't actually wrong. The "ATS" and related codons (see below)
> however are simply wrong.
They do look wrong. The "X when it could pick a residue" ones I knew of.
The others need a closer look. The plan is to work through all possible
codons and all the NCBI genetic codes as soon as the release is out.
It should be a simple patch to ajtranslate.c when I'm done.
> --------------------------------------------------------------------------------------
>
> Now for another debatable one, RAT means AAT or GAT which code
> for N and D. So, you could use B (Asx) here rather than the broader X.
>
> Similarly, you don't use J to mean leucine (L) or isoleucine (I), and
> opt for X (again, this is justifiable), e.g. WTA.
Hmmm ... B and Z are ambiguity codes from amino acid analysers, where all the
amide bonds are broken, and that includes N->D and Q->E. We used to have one
of those in the lab. Similarly, J comes from mass spec, where I and L have the
same molecular weight. I don't consider them appropriate for translation.
So I plan to report a unique amino acid wherever an ambiguous codon allows one.
What do our users think?
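
To make the "unique amino acid where possible" rule concrete, here is a
minimal Python sketch. It is not the ajtranslate.c code and makes no claim
about what transeq currently prints. The names translate_codon and survey are
hypothetical, and falling back to X whenever the expansions disagree
(including a mix of stop and residue, as in TRR) is an assumption. It uses
only NCBI table 1 and the IUPAC nucleotide codes.

```python
from itertools import product

# NCBI table 1 (standard code); residues listed with bases in T, C, A, G order.
BASES = "TCAG"
TABLE1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), TABLE1)}

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def translate_codon(codon):
    """Return the single residue shared by every expansion of codon, else 'X'.

    Assumption: B, Z and J are never emitted; any disagreement gives 'X'.
    """
    codon = codon.upper().replace("U", "T")
    expansions = product(*(IUPAC[base] for base in codon))
    residues = {CODON_TO_AA["".join(c)] for c in expansions}
    return residues.pop() if len(residues) == 1 else "X"

def survey():
    """Translate every codon over the 15-letter IUPAC alphabet."""
    codons = ("".join(c) for c in product(IUPAC, repeat=3))
    return {codon: translate_codon(codon) for codon in codons}

if __name__ == "__main__":
    # Codons raised in this thread, plus a few that resolve to one residue.
    for codon in ("RAT", "WTA", "ATS", "TRR", "YTA", "MGR", "TAR"):
        print(codon, translate_codon(codon))
```

Under these assumptions RAT (N or D), WTA (I or L) and ATS (I or M) all come
out as X, while YTA gives L and MGR gives R. Swapping TABLE1 for another NCBI
table string would repeat the survey for that genetic code.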
regards,
Peter