[EMBOSS] transeq and ambiguous codons
Peter
biopython at maubp.freeserve.co.uk
Thu Jul 9 05:28:20 EDT 2009
On Wed, Jul 8, 2009 at 10:50 PM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> Hi all,
>
> Something I mentioned to Peter Rice in passing at BOSC/ISMB 2009 was
> I'd found an oddity in transeq with certain ambiguous codons while
> testing Biopython's translations. Here is a specific example (but I
> suspect there are more). For reference, I am expecting EMBOSS transeq
> to be using the NCBI tables:
> http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi
>
> First consider the codon TAN, which can be TAA, TAC, TAG or TAT;
> these translate to stop or Y. The translation of TAN is therefore
> "* or Y", and EMBOSS transeq opts for "X", which is fine:
Using raw output instead of the default FASTA works better in emails:
$ transeq asis:TAATACTAGTATTAN -stdout -auto -osformat raw
*Y*YX
> Similarly for the codon TNN, again EMBOSS transeq opts for "X" because
> this could be a stop codon, or W, or F, or L, or S, or Y or C! Again,
> this is fine:
Again, using raw output works better in emails:
$ transeq asis:TNN -stdout -auto -osformat raw
X
> However, consider the codon TRR. R means A or G, so this can mean TAA,
> TGA, TAG or TGG which translate to stop or W (both EMBOSS and the NCBI
> standard table agree here). Therefore the translation of TRR should be
> "* or W", which I would expect based on the above examples to result
> in "X". But instead EMBOSS transeq gives "*":
Again, using raw output works better in emails:
$ transeq asis:TAATGATAGTGGTRR -stdout -auto -osformat raw
***W*
> I think this is a bug.
>
> However, I am aware that the machine I tried this on is rather old,
> and I don't actually know which version of EMBOSS it is.
I can check the old machine later, but I just retested on a Mac using
EMBOSS 6.0.1 (the current release), and see the same behaviour.
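
For anyone who wants to check the expected answers independently of
EMBOSS, here is a minimal Python sketch of the rule I am assuming:
expand each IUPAC ambiguity code, translate every concrete codon with
the NCBI standard table (table 1), and report "X" unless all the
expansions agree. The function names (possible_translations,
translate_codon) are made up for this illustration, and this is not
how transeq or Biopython are actually implemented.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# NCBI standard genetic code (table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def possible_translations(codon):
    """Return the set of translations for every expansion of the codon."""
    options = [IUPAC[base] for base in codon.upper()]
    return {TABLE["".join(c)] for c in product(*options)}


def translate_codon(codon):
    """Translate one codon; give "X" if the expansions disagree.

    Assumption for this sketch: any disagreement (including stop versus
    amino acid) becomes "X"; B/Z/J style ambiguous amino acids are ignored.
    """
    amino_acids = possible_translations(codon)
    return amino_acids.pop() if len(amino_acids) == 1 else "X"


if __name__ == "__main__":
    for codon in ["TAN", "TNN", "TRR"]:
        print(codon, sorted(possible_translations(codon)), translate_codon(codon))
```

Under that rule TRR expands to TAA, TAG, TGA and TGG, so it comes out
as "X" (stop or W), the same way TAN and TNN do.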
Peter C.