[BioPython] Translation method for Seq object

Eric Gibert ericgibert at yahoo.fr
Mon Oct 13 14:38:02 UTC 2008


(a) Seq is an object, string is another object... each of them has various
methods, and coincidentally two of them have the same name...

Eric


-----Original Message-----
From: biopython-bounces at lists.open-bio.org
[mailto:biopython-bounces at lists.open-bio.org] On Behalf Of Peter
Sent: Monday, October 13, 2008 8:39 PM
To: BioPython Mailing List
Subject: [BioPython] Translation method for Seq object

Dear Biopythoneers,

This is a request for feedback about proposed additions to the Seq
object for the next release of Biopython.  I'd like people to pick one
of (a) to (e) in the list below (additional comments or counter
suggestions are welcome).

Enhancement bug 2381 is about adding transcription and translation
methods to the Seq object, allowing an object-oriented style of
programming.

e.g. Current functional programming style:

>>> from Bio.Seq import Seq, transcribe
>>> from Bio.Alphabet import generic_dna
>>> my_seq = Seq("CAGTGACGTTAGTCCG", generic_dna)
>>> my_seq
Seq('CAGTGACGTTAGTCCG', DNAAlphabet())
>>> transcribe(my_seq)
Seq('CAGUGACGUUAGUCCG', RNAAlphabet())

With the latest Biopython in CVS, you can now invoke a Seq object
method instead for transcription (or back transcription):

>>> my_seq.transcribe()
Seq('CAGUGACGUUAGUCCG', RNAAlphabet())

For comparison, consider the shift from Python string functions to
string methods.  This also makes the functionality more discoverable
via dir(my_seq).
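
As a rough illustration of the discoverability point, here is a minimal
sketch using a hypothetical stand-in class called MiniSeq (it is not
Biopython's Seq, and its transcribe simply replaces T with U).  The
method shows up in dir(), while the module-level function does not:

```python
# MiniSeq is a hypothetical stand-in used only for illustration;
# it is NOT Biopython's Seq class.
class MiniSeq:
    def __init__(self, data):
        self.data = data

    def transcribe(self):
        """Return a new MiniSeq with T replaced by U (DNA to RNA)."""
        return MiniSeq(self.data.replace("T", "U"))


def transcribe(seq):
    """Function-style equivalent that delegates to the method."""
    return seq.transcribe()


my_seq = MiniSeq("CAGTGACGTTAGTCCG")

# Public attributes found by introspecting the object itself.
public_names = [name for name in dir(my_seq) if not name.startswith("_")]
print(public_names)

# Both calling styles give the same result.
print(my_seq.transcribe().data)
print(transcribe(my_seq).data)
```

The list printed first includes "transcribe", so a user exploring the
object interactively can find the method without knowing which module
the function lives in.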

Adding Seq object methods "transcribe" and "back_transcribe" doesn't
cause any confusion with the Python string methods.  However, for
translation, the Python string has an existing "translate" method:

> S.translate(table [,deletechars]) -> string
>
> Return a copy of the string S, where all characters occurring
> in the optional argument deletechars are removed, and the
> remaining characters have been mapped through the given
> translation table, which must be a string of length 256.

I don't think this functionality is really of direct use for sequences, and
having a Seq object "translate" method do a biological translation into
a protein sequence is much more intuitive. However, this could cause
confusion if the Seq object is passed to non-Biopython code which
expects a string-like translate method.
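
To make the potential confusion concrete, here is a minimal sketch.  The
names MiniSeq, CODON_TABLE and shout_vowels are hypothetical, the codon
table is deliberately truncated to three entries, and the string call
uses the modern Python 3 form (str.maketrans) rather than the
256-character table described in the docstring quoted above:

```python
# Hypothetical, heavily truncated codon table, for illustration only.
CODON_TABLE = {"ATG": "M", "GCC": "A", "TGA": "*"}


class MiniSeq:
    """Toy sequence class; NOT Biopython's Seq."""

    def __init__(self, data):
        self.data = data

    def translate(self):
        """Biological translation: map each codon to an amino acid."""
        codons = [self.data[i:i + 3] for i in range(0, len(self.data) - 2, 3)]
        # Codons missing from the toy table are shown as "X".
        return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def shout_vowels(text):
    """Generic code that assumes the str.translate(table) signature."""
    table = str.maketrans("aeiou", "AEIOU")
    return text.translate(table)


plain_result = shout_vowels("sequence")            # fine for a real str
protein = MiniSeq("ATGGCCTGA").translate()         # biological meaning

clash_error = None
try:
    shout_vowels(MiniSeq("ATGGCCTGA"))             # same name, other signature
except TypeError as exc:
    clash_error = exc

print(plain_result)
print(protein)
print(type(clash_error).__name__)
```

In this sketch the generic helper fails with a TypeError when handed
the sequence object, because the method it finds takes no table
argument.  A real Seq class could behave differently, so this only
illustrates the kind of surprise being discussed.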

To avoid this naming clash, a different method name would be needed.

This is where some user feedback would be very welcome - I think
the following cover all the alternatives of what to call a biological
translation function (nucleotide to protein):

(a) Just use translate (ignore the existing string method)
(b) Use translate_ (trailing underscore, see PEP8)
(c) Use translation (a noun rather than verb; different style).
(d) Use something else (e.g. bio_translate or ...)
(e) Don't add a biological translation method at all because ...

Thanks,

Peter

See also http://bugzilla.open-bio.org/show_bug.cgi?id=2381
_______________________________________________
BioPython mailing list  -  BioPython at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biopython




