[Biopython-dev] [Bug 2530] Bio.Seq.translate() treats invalid codons as stops

Peter biopython at maubp.freeserve.co.uk
Sun Jul 20 15:03:48 UTC 2008


> ------- Comment #12 from mmokrejs -------
> Created an attachment (id=973)
>  --> (http://bugzilla.open-bio.org/attachment.cgi?id=973&action=view)
> translate_ESTs.py

Martin,

I had some general comments on your code which you might find helpful.

Most of your variable names start with an underscore, which is very
unusual.  By convention in Python, a single leading underscore marks
the private attributes or methods of an object.

You used the following code to reverse a string by turning it into a
list and back again:
                _reversed = list(_record.sequence)
                _reversed.reverse()
                _reversed = ''.join(_reversed)

For simply reversing a string, I would suggest using a slice with a
stride of minus one instead:
                reversed_string = old_string[::-1]
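
As a quick illustration that the two approaches give the same result,
here is a small sketch.  The sequence is made up for the example and is
not taken from your script:

```python
# Hypothetical input sequence, chosen only for illustration.
old_string = "ATGGCC"

# Approach from the script: list, reverse in place, join back.
as_list = list(old_string)
as_list.reverse()
via_list = ''.join(as_list)

# Suggested approach: a slice with a stride of -1.
reversed_string = old_string[::-1]

print(via_list)         # CCGGTA
print(reversed_string)  # CCGGTA
```

The slice does the same job in one expression and avoids building the
intermediate list by hand.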

You then go on to take the reverse complement (without worrying about
ambiguous characters which could be present, e.g. R -> Y):
                _reversed = list(_record.sequence)
                _reversed.reverse()
                _reversed = ''.join(_reversed)
                _reversed = _reversed.translate(
                    string.maketrans('AaTtGgCcUu', 'TtAaCcGgAa'), '')

I would suggest using the Bio.Seq.reverse_complement() function here instead.
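
To show why the ambiguous characters matter, here is a hand-rolled
sketch of a reverse complement that also maps the IUPAC ambiguity
codes (R <-> Y, K <-> M, and so on).  It is only an illustration and is
not how Biopython implements it; the function name is made up, and it
is written for Python 3, where str.maketrans replaces
string.maketrans.  In real code, Bio.Seq.reverse_complement() is the
better choice.

```python
# Illustrative sketch only, not Biopython's implementation.
# IUPAC nucleotide codes and their complements (U is treated like T,
# so the output is DNA).
_CODES       = "ACGTURYSWKMBDHVN"
_COMPLEMENTS = "TGCAAYRSWMKVHDBN"

# Build one translation table covering upper and lower case.
_TABLE = str.maketrans(_CODES + _CODES.lower(),
                       _COMPLEMENTS + _COMPLEMENTS.lower())


def reverse_complement_sketch(sequence):
    """Return the reverse complement of a nucleotide string.

    Hypothetical helper for illustration; characters outside the
    IUPAC table are left unchanged.
    """
    return sequence.translate(_TABLE)[::-1]


print(reverse_complement_sketch("ATGR"))  # YCAT
```

With the ten-character table from the script, the R in this example
would pass through untranslated, giving RCAT instead of YCAT.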

Finally, are you aware of the string formatting operator (%) in Python?
The following code:

_outprothandle.write(''.join(('>', _record.gi, ' ',
_record.definition, ' frame:-3', '\n',
translate(_reversed[2:]).replace('*','X'), '\n')))

might typically be written as:

_outprothandle.write('>%s %s frame:-3\n%s\n' % (_record.gi,
_record.definition, translate(_reversed[2:]).replace('*','X')))

See http://docs.python.org/lib/typesseq-strings.html for more details
(and how to use named insertion points).
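
For what it is worth, here is a small sketch of both forms of the %
operator, positional and named.  The record values below are
placeholders invented for the example, not data from your script:

```python
# Placeholder values, invented for illustration.
gi = "12345"
definition = "hypothetical EST record"
protein = "MKV*LL".replace('*', 'X')

# Positional insertion points, filled from a tuple in order.
positional = '>%s %s frame:-3\n%s\n' % (gi, definition, protein)

# Named insertion points, filled from a dictionary by key.
named = '>%(gi)s %(definition)s frame:%(frame)d\n%(protein)s\n' % {
    'gi': gi,
    'definition': definition,
    'frame': -3,
    'protein': protein,
}

print(positional, end='')
```

The named form is longer, but it is easier to keep straight once a
template has more than a few fields or reuses the same value.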

Peter


