[BioPython] Translation method for Seq object

Martin MOKREJŠ mmokrejs at ribosome.natur.cuni.cz
Sun Oct 19 15:52:29 UTC 2008


Hi,
  I have been away for 2 weeks, so this comes late, but let
me disagree: string.translate() is of use on sequences. Here is my
current code:

    # make sure no unallowed chars are present in the sequence
    # (Python 2; needs "import string" at module level)
    # translate() with an identity table and deletechars returns whatever
    # is left once the allowed characters are deleted; an empty result
    # means every character was allowed
    _identity = string.maketrans('', '')
    if type == "DNA":
        if _sequence.translate(_identity, 'GgAaTtCc'):
            if _sequence.translate(_identity, 'GgAaTtCcBbDdSsWw'):
                if _sequence.translate(_identity, 'GgAaTtCcRrYyWwSsMmKkHhBbVvDdNn'):
                    raise ValueError("DNA sequence contains unallowed characters: " + _sequence.translate(_identity, 'GgAaTtCcRrYyWwSsMmKkHhBbVvDdNn'))
                else:
                    _warning = "DNA sequence contains IUPACAmbiguousDNA characters, which cannot be interpreted uniquely. Please try to find sequence of higher quality."
            else:
                _warning = "DNA sequence contains ExtendedIUPACDNA characters. " + _sequence.translate(_identity, 'GgAaTtCc') + " Please try to find sequence of higher quality."
    elif type == "RNA":
        if _sequence.translate(_identity, 'GgAaUuCc'):
            if _sequence.translate(_identity, 'GgAaUuCcRrYyWwSsMmKkHhBbVvDdNn'):
                raise ValueError("RNA sequence contains unallowed characters: " + _sequence.translate(_identity, 'GgAaUuCcRrYyWwSsMmKkHhBbVvDdNn'))
            else:
                _warning = "RNA sequence contains IUPACAmbiguousRNA characters. " + _sequence.translate(_identity, 'GgAaUuCc') + " Please try to find sequence of higher quality."
        # store RNA as DNA letters
        _sequence = _sequence.translate(string.maketrans('Uu', 'Tt'))
    return (_warning, _type, _description, _sequence)
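
For anyone reading this on Python 3, where str.translate() takes only a
table and no deletechars argument, the same delete-and-inspect idea might
look roughly like the sketch below. The names leftover(), classify_dna()
and the three alphabet constants are made up for illustration and are not
part of Biopython; the letter sets are the ones used in the code above.

```python
# Python 3 sketch of the "delete the allowed letters, look at what is left"
# check. All names are hypothetical, for illustration only.
PLAIN_DNA = "GgAaTtCc"
EXTENDED_DNA = PLAIN_DNA + "BbDdSsWw"
AMBIGUOUS_DNA = "GgAaTtCcRrYyWwSsMmKkHhBbVvDdNn"


def leftover(sequence, allowed):
    """Return the characters of sequence that are not in allowed."""
    # The three-argument form of str.maketrans builds a table that
    # deletes every character listed in its third argument.
    return sequence.translate(str.maketrans("", "", allowed))


def classify_dna(sequence):
    """Return the narrowest letter set covering sequence, or raise."""
    if not leftover(sequence, PLAIN_DNA):
        return "plain"
    if not leftover(sequence, EXTENDED_DNA):
        return "extended"
    bad = leftover(sequence, AMBIGUOUS_DNA)
    if bad:
        raise ValueError("DNA sequence contains unallowed characters: " + bad)
    return "ambiguous"
```

Calling classify_dna() on a sequence with letters outside all three sets
raises ValueError and names the offending characters, which is the part
of the check that relies on translate() returning the leftovers.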


I would have voted for b) or c).
Martin


Peter wrote:
> Dear Biopythoneers,
> 
> This is a request for feedback about proposed additions to the Seq
> object for the next release of Biopython.  I'd like people to pick (a)
> to (e) in the list below (with additional comments or counter
> suggestions welcome).
> 
> Enhancement bug 2381 is about adding transcription and translation
> methods to the Seq object, allowing an object orientated style of
> programming.
> 
> e.g. Current functional programming style:
> 
> >>> from Bio.Seq import Seq, transcribe
> >>> from Bio.Alphabet import generic_dna
> >>> my_seq = Seq("CAGTGACGTTAGTCCG", generic_dna)
> >>> my_seq
> Seq('CAGTGACGTTAGTCCG', DNAAlphabet())
> >>> transcribe(my_seq)
> Seq('CAGUGACGUUAGUCCG', RNAAlphabet())
> 
> With the latest Biopython in CVS, you can now invoke a Seq object
> method instead for transcription (or back transcription):
> 
> >>> my_seq.transcribe()
> Seq('CAGUGACGUUAGUCCG', RNAAlphabet())
> 
> For comparison, consider the shift from Python string functions to
> string methods.  This also makes the functionality more discoverable
> via dir(my_seq).
> 
> Adding Seq object methods "transcribe" and "back_transcribe" doesn't
> cause any confusion with the python string methods.  However, for
> translation, the python string has an existing "translate" method:
> 
>> S.translate(table [,deletechars]) -> string
>>
>> Return a copy of the string S, where all characters occurring
>> in the optional argument deletechars are removed, and the
>> remaining characters have been mapped through the given
>> translation table, which must be a string of length 256.
> 
> I don't think this functionality is really of direct use for sequences, and
> having a Seq object "translate" method do a biological translation into
> a protein sequence is much more intuitive. However, this could cause
> confusion if the Seq object is passed to non-Biopython code which
> expects a string-like translate method.
> 
> To avoid this naming clash, a different method name would be needed.
> 
> This is where some user feedback would be very welcome - I think
> the following cover all the alternatives of what to call a biological
> translation function (nucleotide to protein):
> 
> (a) Just use translate (ignore the existing string method)
> (b) Use translate_ (trailing underscore, see PEP8)
> (c) Use translation (a noun rather than verb; different style).
> (d) Use something else (e.g. bio_translate or ...)
> (e) Don't add a biological translation method at all because ...
> 
> Thanks,
> 
> Peter
> 
> See also http://bugzilla.open-bio.org/show_bug.cgi?id=2381
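
The naming clash described in the quoted message can be sketched in a few
lines of Python 3. ToySeq is a hypothetical stand-in written for this
example, not Biopython's Seq; it subclasses str only to keep the sketch
short, and its three-entry codon table is deliberately incomplete.
strip_gaps() plays the role of the non-Biopython code that expects the
string translate method.

```python
class ToySeq(str):
    """Hypothetical sequence type; NOT Biopython's Seq."""

    # Tiny codon table, for illustration only.
    _CODONS = {"ATG": "M", "GCC": "A", "TAA": "*"}

    def translate(self, *args):
        # Option (a): reuse the name for biological translation,
        # which shadows str.translate(table).
        if args:
            raise TypeError("ToySeq.translate() takes no mapping table")
        codons = (self[i:i + 3] for i in range(0, len(self) - 2, 3))
        return "".join(self._CODONS.get(codon, "X") for codon in codons)


def strip_gaps(text):
    """Generic string code that relies on str.translate semantics."""
    return text.translate(str.maketrans("", "", "-"))
```

Here strip_gaps("AT-G") works on a plain string, while
strip_gaps(ToySeq("AT-G")) fails with TypeError because the method of the
same name now means something else. Options (b) to (d) avoid this by
leaving the string meaning of translate alone.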


