[EMBOSS] transeq and ambiguous codons
Peter
biopython at maubp.freeserve.co.uk
Wed Jul 8 17:50:19 EDT 2009
Hi all,
Something I mentioned to Peter Rice in passing at BOSC/ISMB 2009 was
that I'd found an oddity in transeq with certain ambiguous codons while
testing Biopython's translations. Here is a specific example (but I
suspect there are more). For reference, I am expecting EMBOSS transeq
to be using the NCBI tables:
http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi
First consider the codon TAN, which can be TAA, TAC, TAG or TAT, and
these translate to stop or Y. The translation of TAN should therefore
be "* or Y", and EMBOSS transeq opts for "X", which is fine:
$ transeq asis:TAATACTAGTATTAN -stdout -auto
>asis_1
*Y*YX
Similarly for the codon TNN, again EMBOSS transeq opts for "X" because
this could be a stop codon, or W, or F, or L, or S, or Y or C! Again,
this is fine:
$ transeq asis:TNN -stdout -auto
>asis_1
X
However, consider the codon TRR. R means A or G, so this can mean TAA,
TGA, TAG or TGG which translate to stop or W (both EMBOSS and the NCBI
standard table agree here). Therefore the translation of TRR should be
"* or W", which I would expect based on the above examples to result
in "X". But instead EMBOSS transeq gives "*":
$ transeq asis:TAATGATAGTGGTRRTNN -stdout -auto
>asis_1
***W*X
I think this is a bug.
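
To make the expectation concrete, here is a minimal Python sketch of the rule I have in mind: expand each IUPAC ambiguity code into its concrete bases, translate every resulting codon with the NCBI standard table (table 1), and report "X" whenever more than one outcome is possible. This is not how EMBOSS or Biopython implement translation; the names possible_translations, translate_ambiguous and translate are made up for illustration, and the sketch deliberately ignores the B (D or N) and Z (E or Q) protein ambiguity codes.

```python
from itertools import product

# NCBI standard genetic code (table 1), listed in the usual TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide ambiguity codes mapped to the bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def possible_translations(codon):
    """Return the set of translations an ambiguous codon could have."""
    options = [IUPAC[base] for base in codon.upper()]
    return {STANDARD_TABLE["".join(c)] for c in product(*options)}


def translate_ambiguous(codon):
    """Translate one codon, using "X" if the outcome is not unique."""
    outcomes = possible_translations(codon)
    return outcomes.pop() if len(outcomes) == 1 else "X"


def translate(seq):
    """Translate a sequence codon by codon, ignoring a trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(translate_ambiguous(seq[i:i + 3]) for i in range(0, usable, 3))


if __name__ == "__main__":
    for codon in ("TAN", "TNN", "TRR"):
        print(codon, sorted(possible_translations(codon)), translate_ambiguous(codon))
    print(translate("TAATGATAGTGGTRRTNN"))
```

Under this rule TRR comes out as "X", because its four expansions (TAA, TAG, TGA, TGG) give both stop and W, so the last example would read ***WXX rather than the ***W*X that transeq printed.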
However, I am aware that the machine I tried this on is rather old,
and I don't actually know which version of EMBOSS it is. How can I
find out? As far as I know, there is no "-version" or "-v" or
"--version" switch, and the "-help" information doesn't include this
important piece of information. Nor is this in the FAQ:
http://emboss.sourceforge.net/docs/faq.html
So that makes two questions - how should transeq translate "TRR", and
how do I check the version of EMBOSS?
Thanks,
Peter C.