[Biopython-dev] [Bug 2381] translate and transcribe methods for the Seq object (in Bio.Seq)
bugzilla-daemon at portal.open-bio.org
Tue Nov 4 16:11:49 UTC 2008
http://bugzilla.open-bio.org/show_bug.cgi?id=2381
------- Comment #33 from biopython-bugzilla at maubp.freeserve.co.uk 2008-11-04 11:11 EST -------
(In reply to comment #32)
> > In which of these examples do you understand that the first position is
> > being forced to a Methionine?
With my suggested code, you would not just be forcing the first codon to be a
methionine. You would also be asking for the first codon to be validated as a
start codon (initialisation codon).
> None are particularly clear, but only one of them doesn't give me the wrong
> idea...
In some cases I seem to have guessed different possible meanings for some of
these suggested names - so those are probably unclear.
> > >>> translate("TTGAAACCCTAG", init=True, to_stop=True)
>
> Because I've read this thread (or looked at the docs) - I understand this one
> ;)
To me this suggests something special is happening with the initialisation of
the translation - but I agree it's not clear what without checking the
documentation.
> > >>> translate("TTGAAACCCTAG", force_as_translating=True, to_stop=True)
>
> I don't intuitively understand this. Does it mean that the sequence should be
> translatable?
Ditto - an argument called force_as_translating means nothing to me. You're
calling a translation method so what can forcing a translation mean?
> > >>> translate("TTGAAACCCTAG", force_methionine=True, to_stop=True)
>
> Does this mean that the sequence will be translated from the first methionine
> the method finds?
I would have guessed force_methionine would ignore the value of the first three
nucleotides in order to treat them as a methionine (even if they are not a
start codon).
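To make the difference between those two readings concrete, here is a rough
Python sketch. All the function names are hypothetical (this is not the code
in the attachment), and I am assuming the NCBI standard table, where TTG, CTG
and ATG are the listed start codons:

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: NCBI table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
START_CODONS = {"TTG", "CTG", "ATG"}  # start codons listed for the standard table


def plain_translate(seq):
    """Translate codon by codon, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def translate_force_methionine(seq):
    """Reading 1: treat the first codon as M, whatever it actually is."""
    return "M" + plain_translate(seq)[1:]


def translate_validated_start(seq):
    """Reading 2: insist the first codon is a start codon, then use M."""
    if seq[:3] not in START_CODONS:
        raise ValueError("First codon %r is not a start codon" % seq[:3])
    return "M" + plain_translate(seq)[1:]
```

With "TTGAAACCCTAG" both readings give the same answer, because TTG is a valid
start codon. They only differ on something like "AAAAAACCCTAG", where the
first reading silently returns a methionine and the second raises an error.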
> > >>> translate("TTGAAACCCTAG", force_methionine=True, force_stop=True)
>
> As above, and does force_stop mean that you add a '*' to the end of the
> translation? Or that you stop at a stop codon?
Like Leighton, I would be confused by "force_stop". It could mean add a stop
symbol to the end of the amino acid sequence even if there isn't one there
already.
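As a small illustration of why the name is ambiguous, here are the two
readings written out in Python. Both helpers are hypothetical and work on an
already translated amino acid string, with '*' as the stop symbol:

```python
def stop_at_first_stop(protein):
    """Reading 1: cut the translation at the first stop symbol."""
    return protein.split("*", 1)[0]


def ensure_trailing_stop(protein):
    """Reading 2: make sure the translation ends with a stop symbol."""
    return protein if protein.endswith("*") else protein + "*"
```

The first turns "LKP*" into "LKP"; the second turns "LKP" into "LKP*". An
argument name that could plausibly mean either is not a good choice.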
> > >>> translate("TTGAAACCCTAG", alt_start=True, alt_stop=True)
>
> 'alt_start' I would think referred to allowing translation from alternative
> start codons. I don't know what alt_stop would mean...
I think "alt_start" would be misleading for the intended dual functionality.
Consider the typical use case for this option - translating a CDS, which most
of the time will use the typical start codon AUG / ATG (but not always).
We'd want the start codon validated - and it often won't be an alternative
start codon. So calling the argument "alt_start" is confusing.
> > Also, I don't think this option will be used very often.
>
> Maybe not. The first use case that comes to mind is QA on CDS-finding:
>
> # Check if sequence is CDS:
> assert candidate_cds.translate(init=True)
> # Check if reported CDS start is valid
> assert est[37:].translate(init=True)
>
> A second use case is slower in presenting itself...
I think translating a CDS is quite a common task - so a very long argument
name would be bad.
Instead of the "init" start codon option in attachment 1032, I'd also be happy
with a single boolean argument which does start codon validation, treats this
as a methionine, checks the sequence is a multiple of three in length, checks
for a final stop codon, and checks for no additional stop codons. We'd ruled
out calling this "complete", but maybe "cds" would be better?
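To spell out what such a single option would check, here is a minimal Python
sketch. The function name translate_cds is hypothetical, this is not the code
in the attachment, and I am again assuming the NCBI standard table with TTG,
CTG and ATG as start codons:

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: NCBI table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
START_CODONS = {"TTG", "CTG", "ATG"}  # start codons listed for the standard table


def translate_cds(seq):
    """Translate a complete CDS, raising ValueError if any check fails."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("Sequence length %d is not a multiple of three" % len(seq))
    if seq[:3] not in START_CODONS:
        raise ValueError("First codon %r is not a start codon" % seq[:3])
    protein = "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))
    if not protein.endswith("*"):
        raise ValueError("Final codon %r is not a stop codon" % seq[-3:])
    if "*" in protein[:-1]:
        raise ValueError("Extra in-frame stop codon found")
    # Start codon becomes methionine; the final stop symbol is dropped.
    return "M" + protein[1:-1]
```

With this sketch, translate_cds("TTGAAACCCTAG") gives "MKP", while a sequence
with a bad start codon, a partial final codon, no final stop codon, or an
internal stop codon raises a ValueError.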
> > So, it shouldn't be a problem if its name is too long to type, and it would
> > be better if it is easy to understand.
>
> That's a fair argument, I think. On the whole, though, I would favour a
> short, unambiguous, slightly cryptic name over a very long, unambiguous
> name, over an ambiguous name of any length.
There is a lot of subjectivity in argument naming - clearly we have not come
up with a perfect suggestion yet.
Unfortunately "init" can be misunderstood (I'm not 100% sure what you were
trying to say in comment 31, but I think you thought the name "init" could
mean some sort of optional optimisation or initialisation step).
How about "cds_start" instead of "init"?