[BioPython] Translation method for Seq object

Bruce Southey bsouthey at gmail.com
Mon Oct 13 14:58:07 UTC 2008


Peter wrote:
> Dear Biopythoneers,
>
> This is a request for feedback about proposed additions to the Seq
> object for the next release of Biopython.  I'd like people to pick (a)
> to (e) in the list below (with additional comments or counter
> suggestions welcome).
>
> Enhancement bug 2381 is about adding transcription and translation
> methods to the Seq object, allowing an object-oriented style of
> programming.
>
> e.g. Current functional programming style:
>
> >>> from Bio.Seq import Seq, transcribe
> >>> from Bio.Alphabet import generic_dna
> >>> my_seq = Seq("CAGTGACGTTAGTCCG", generic_dna)
> >>> my_seq
> Seq('CAGTGACGTTAGTCCG', DNAAlphabet())
> >>> transcribe(my_seq)
> Seq('CAGUGACGUUAGUCCG', RNAAlphabet())
>
> With the latest Biopython in CVS, you can now invoke a Seq object
> method instead for transcription (or back transcription):
>
> >>> my_seq.transcribe()
> Seq('CAGUGACGUUAGUCCG', RNAAlphabet())
>
> For comparison, consider the shift from Python string functions to
> string methods.  This also makes the functionality more discoverable
> via dir(my_seq).
>
> Adding Seq object methods "transcribe" and "back_transcribe" doesn't
> cause any confusion with the Python string methods.  However, for
> translation, the Python string has an existing "translate" method:
>
> > S.translate(table [,deletechars]) -> string
> >
> > Return a copy of the string S, where all characters occurring
> > in the optional argument deletechars are removed, and the
> > remaining characters have been mapped through the given
> > translation table, which must be a string of length 256.
>
> I don't think this functionality is really of direct use for sequences, and
> having a Seq object "translate" method do a biological translation into
> a protein sequence is much more intuitive. However, this could cause
> confusion if the Seq object is passed to non-Biopython code which
> expects a string-like translate method.
>
> To avoid this naming clash, a different method name would be needed.
>
> This is where some user feedback would be very welcome - I think
> the following cover all the alternatives of what to call a biological
> translation function (nucleotide to protein):
>
> (a) Just use translate (ignore the existing string method)
> (b) Use translate_ (trailing underscore, see PEP8)
> (c) Use translation (a noun rather than verb; different style).
> (d) Use something else (e.g. bio_translate or ...)
> (e) Don't add a biological translation method at all because ...
>
> Thanks,
>
> Peter
>
> See also http://bugzilla.open-bio.org/show_bug.cgi?id=2381
>
>   
Hi,
My view is that it is generally best to avoid confusion where 
possible. But 'translate' is not a reserved word, and the Python 
documentation notes that the unicode version lacks the optional 
deletechars argument (so there is precedent for using the same word 
with a different signature). This also touches on the methods versus 
functions argument, but many of the string functions have been 
deprecated and will be removed in Python 3.0 (so in Python 3.0 I think 
it will be hard to get a name clash without some strange inheritance 
going on).

Therefore, provided 'translate' is a method of Seq, I do not see any 
strong reason to avoid it, except that it is long (but shorter than 
'translation') :-)
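
To make the point concrete, here is a rough sketch of what option (a) 
looks like from the caller's side. ToySeq is a hypothetical stand-in 
written for this message, not Biopython's Seq class; it assumes the 
standard genetic code and simply drops any trailing partial codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption
# of this sketch; other translation tables are not handled).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


class ToySeq:
    """Hypothetical sequence wrapper used only for illustration."""

    def __init__(self, data):
        self.data = data.upper()

    def __str__(self):
        return self.data

    def transcribe(self):
        # DNA -> RNA: replace T with U.
        return ToySeq(self.data.replace("T", "U"))

    def translate(self):
        # Biological translation; same name as str.translate, but a
        # different meaning and signature.
        dna = self.data.replace("U", "T")
        usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
        return ToySeq(
            "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))
        )


# The method on the sequence object does the biological translation ...
protein = ToySeq("ATGGCCATTGTAATGGGCCGCTGA").translate()

# ... while the plain string method still does character mapping
# (Python 3 spelling shown here).
complement = "CAGT".translate(str.maketrans("ACGT", "TGCA"))
```

The two only collide if a ToySeq is handed to code that expects the 
string behaviour, which is the case Peter describes above.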

Would it be too cryptic to have dna(), rna() and protein() methods that 
provide the appropriate conversion based on the Seq type?
Obviously, reverse translation of a protein sequence to a DNA sequence 
is complex when there are many possible solutions.
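
As a rough illustration of how quickly the solutions multiply, the 
sketch below counts the DNA sequences that would translate to a given 
protein. It assumes the standard genetic code; count_back_translations 
is a name made up for this example, not an existing Biopython function.

```python
from collections import Counter
from math import prod

# Standard genetic code as one amino acid per codon (TCAG order).
# Counting letters gives the number of codons for each residue.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_PER_RESIDUE = Counter(AMINO_ACIDS)


def count_back_translations(protein):
    """Number of distinct DNA sequences encoding `protein`.

    Residues missing from the table count as zero codons, so the
    result is 0 for a sequence containing unknown letters.
    """
    return prod(CODONS_PER_RESIDUE[residue] for residue in protein.upper())


# Methionine and tryptophan have a single codon each, so "MW" has
# exactly one back translation; "MAIVMGR" already has 1152.
unique = count_back_translations("MW")
many = count_back_translations("MAIVMGR")
```

So a protein() -> dna() conversion would have to pick one answer by 
some rule, or return an ambiguous sequence.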

Regards
Bruce




