[Biopython-dev] [Bug 2618] back_translate method for the Seq object (in Bio.Seq)?

bugzilla-daemon at portal.open-bio.org
Wed Oct 22 15:43:06 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2618


biopython-bugzilla at maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX




------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk  2008-10-22 11:43 EST -------
After some lively debate on the mailing list, we failed to come up with any
real-world examples where a simple back_translate method (or a Bio.Seq
back_translate function) returning a string or Seq object would be useful.

A simple string or the current Seq object simply cannot represent all the
possible codons in a back translation.

Consider the standard table for leucine, Leu/L = {TTA, TTG, CTT, CTC, CTA, CTG}
= {TTR, CTN} which covers 6 unambiguous codons.

This is a subset of YTN = {TTC, TTA, TTG, TTT, CTC, CTA, CTG, CTT} which
covers 8 unambiguous codons.
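
The codon counts above can be checked with a short, self-contained sketch. It
does not use Biopython; the IUPAC dictionary (limited to the codes needed
here) and the expand() helper are hypothetical names introduced only for this
illustration.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes needed for this example (a subset only).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}


def expand(codon):
    """Return the set of unambiguous codons matched by an ambiguous codon."""
    return {"".join(bases) for bases in product(*(IUPAC[b] for b in codon))}


# The six leucine codons in the standard table.
leucine = {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"}

ttr_ctn = expand("TTR") | expand("CTN")
ytn = expand("YTN")

print(ttr_ctn == leucine)     # True: TTR and CTN together are exactly leucine
print(len(ytn))               # 8
print(sorted(ytn - leucine))  # ['TTC', 'TTT']
```

The two extra codons reached by YTN are the ones that do not code for leucine.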

Having back_translate("L") == "CTN" means translate(back_translate("L")) ==
"L", but doesn't cover the two codons TTR (i.e. TTA or TTG).  At least this is
better than back_translate("L") == "TTR" which still has
translate(back_translate("L")) == "L", but doesn't cover the four codons CTN. 
Picking any one of the six codons also ensures translate(back_translate("L"))
== "L" but of course doesn't cover the other five codons.  In all three cases,
the utility of the back translation is limited (e.g. no help for searches).

Having back_translate("L") == "YTN" means translate(back_translate("L")) ==
"X", which would surprise many. Using "YTN" covers all six leucine codons plus
two extra ones (TTT and TTC, which code for phenylalanine).  This might be
useful for searching purposes, but otherwise it's very misleading.
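
The round-trip behaviour described in the last two paragraphs can be sketched
as follows. This is a simplified stand-in, not Biopython's translate
implementation: translate_codon() is a hypothetical helper, the codon table
holds only the leucine and phenylalanine entries, and the rule "return X when
the expansions disagree" is an assumption chosen to match the results quoted
above.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes needed for this example (a subset only).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}

# Only the standard-table entries needed here: leucine (L), phenylalanine (F).
CODON_TABLE = {
    "TTA": "L", "TTG": "L", "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L",
    "TTT": "F", "TTC": "F",
}


def translate_codon(codon):
    """Translate one possibly ambiguous codon.

    Returns the amino acid if every expansion agrees, otherwise "X".
    """
    amino_acids = {
        CODON_TABLE["".join(bases)]
        for bases in product(*(IUPAC[b] for b in codon))
    }
    return amino_acids.pop() if len(amino_acids) == 1 else "X"


for codon in ("CTN", "TTR", "CTG", "YTN"):
    print(codon, translate_codon(codon))
# CTN L
# TTR L
# CTG L
# YTN X
```

CTN, TTR and any single codon all translate back to "L" but each leaves some
leucine codons out; YTN includes them all but no longer translates to "L".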

However, while I am marking this bug as WONTFIX, returning a more complex
ambiguous sequence representation (e.g. using regular expressions) may have
merit.
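
One way such a regular-expression representation might look is sketched below.
Nothing here is an existing Biopython API: BACK_TABLE and
back_translate_regex() are hypothetical names, and only leucine and
methionine are filled in.

```python
import re

# Hypothetical per-residue patterns (standard table); only two residues shown.
BACK_TABLE = {
    "L": "(?:TT[AG]|CT[ACGT])",  # exactly the six leucine codons
    "M": "ATG",
}


def back_translate_regex(protein):
    """Return a regex string matching every DNA sequence coding for protein."""
    return "".join(BACK_TABLE[aa] for aa in protein)


pattern = re.compile(back_translate_regex("ML"))

print(pattern.pattern)                    # ATG(?:TT[AG]|CT[ACGT])
print(bool(pattern.fullmatch("ATGTTA")))  # True
print(bool(pattern.fullmatch("ATGCTC")))  # True
print(bool(pattern.fullmatch("ATGTTT")))  # False: TTT codes for phenylalanine
```

Unlike "YTN", this pattern matches all six leucine codons and nothing else, at
the cost of no longer being a plain sequence string.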

