[EMBOSS] transeq and ambiguous codons

Peter biopython at maubp.freeserve.co.uk
Wed Jul 8 21:50:19 UTC 2009


Hi all,

Something I mentioned to Peter Rice in passing at BOSC/ISMB 2009 was
that I'd found an oddity in transeq with certain ambiguous codons while
testing Biopython's translations. Here is a specific example (but I
suspect there are more). For reference, I am expecting EMBOSS transeq
to be using the NCBI tables:
http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi

First consider the codon TAN, which can be TAA, TAC, TAG or TAT.
These translate to stop (*) or Y, so the translation of TAN should be
"* or Y". EMBOSS transeq opts for "X", which is fine:

$ transeq asis:TAATACTAGTATTAN -stdout -auto
>asis_1
*Y*YX
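
To make the reasoning above concrete, here is a minimal Python sketch that
expands an ambiguous codon into every unambiguous codon it could stand for
and collects the resulting translations. It assumes the NCBI standard table
(table 1); the names STANDARD_TABLE, AMBIGUOUS and possible_translations are
hypothetical, chosen for illustration, and only the ambiguity codes used in
this message (R and N) are included. It says nothing about how transeq works
internally.

```python
from itertools import product

# NCBI standard genetic code (table 1), with codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: only the IUPAC codes needed for these examples are listed.
AMBIGUOUS = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "N": "ACGT"}


def possible_translations(codon):
    """Return the set of translations an ambiguous codon could have."""
    expansions = product(*(AMBIGUOUS[base] for base in codon))
    return {STANDARD_TABLE["".join(bases)] for bases in expansions}


print(sorted(possible_translations("TAN")))  # ['*', 'Y']
print(sorted(possible_translations("TAA")))  # ['*']
```

A codon with more than one possible translation is a candidate for "X".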

Similarly for the codon TNN, again EMBOSS transeq opts for "X" because
this could be a stop codon, or W, or F, or L, or S, or Y or C! Again,
this is fine:

$ transeq asis:TNN -stdout -auto
>asis_1
X

However, consider the codon TRR. R means A or G, so this can mean TAA,
TGA, TAG or TGG which translate to stop or W (both EMBOSS and the NCBI
standard table agree here). Therefore the translation of TRR should be
"* or W", which I would expect based on the above examples to result
in "X". But instead EMBOSS transeq gives "*":

$ transeq asis:TAATGATAGTGGTRRTNN -stdout -auto
>asis_1
***W*X

I think this is a bug.
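
For comparison, here is a short Python sketch of the rule I would expect:
translate a codon to a single letter only when every possible expansion
agrees, and otherwise give "X". This is an assumption about the desired
behaviour, not a description of the transeq implementation. The function
name translate_with_x is hypothetical, the table is the NCBI standard table
(table 1), and only the ambiguity codes R and N are handled.

```python
from itertools import product

# NCBI standard genetic code (table 1), with codons ordered over T, C, A, G.
CODONS = ["".join(c) for c in product("TCAG", repeat=3)]
TABLE = dict(zip(
    CODONS,
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))

# Assumption: a reduced IUPAC map covering only the codes in this message.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "N": "ACGT"}


def translate_with_x(seq):
    """Translate seq codon by codon, giving X when the result is ambiguous."""
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        # Every amino acid (or stop) this codon could encode.
        options = {
            TABLE["".join(bases)]
            for bases in product(*(IUPAC[b] for b in codon))
        }
        protein.append(options.pop() if len(options) == 1 else "X")
    return "".join(protein)


print(translate_with_x("TAATGATAGTGGTRRTNN"))  # ***WXX
```

Under this rule TRR gives "X", so the sequence above would come out as
***WXX rather than the ***W*X reported by transeq.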

However, I am aware that the machine I tried this on is rather old,
and I don't actually know which version of EMBOSS it is. How can I
find out? As far as I know, there is no "-version" or "-v" or
"--version" switch, and the "-help" information doesn't include this
important piece of information. Nor is this in the FAQ:
http://emboss.sourceforge.net/docs/faq.html

So that makes two questions - how should transeq translate "TRR", and
how do I check the version of EMBOSS?

Thanks,

Peter C.


