[Biopython] Question on seq.translate()

Sebastian Bassi sbassi at gmail.com
Sat Jun 3 22:28:21 UTC 2017


>>> from Bio.Seq import Seq
>>> import Bio.Alphabet
>>> seq = Seq('CCGGGTT', Bio.Alphabet.IUPAC.unambiguous_dna)
>>> seq.translate()
/home/sbassi/projects/venvs/biopy169/lib/python3.5/site-packages/Bio/Seq.py:2095:
BiopythonWarning: Partial codon, len(sequence) not a multiple of three.
Explicitly trim the sequence or add trailing N before translation. This may
become an error in future.
  BiopythonWarning)
Seq('PG', IUPACProtein())

So I added two Ns to make it a multiple of three, but I got the error below
and I don't know whether this is the intended behavior:

>>> seq = Seq('CCGGGTTNN', Bio.Alphabet.IUPAC.unambiguous_dna)
>>> seq.translate()
Traceback (most recent call last):
  File "/home/sbassi/projects/venvs/biopy169/lib/python3.5/site-packages/Bio/Seq.py", line 2107, in _translate_str
    amino_acids.append(forward_table[codon])
KeyError: 'TNN'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/sbassi/projects/venvs/biopy169/lib/python3.5/site-packages/Bio/Seq.py", line 1038, in translate
    cds, gap=gap)
  File "/home/sbassi/projects/venvs/biopy169/lib/python3.5/site-packages/Bio/Seq.py", line 2124, in _translate_str
    "Codon '{0}' is invalid".format(codon))
Bio.Data.CodonTable.TranslationError: Codon 'TNN' is invalid


I was expecting 'TNN' to be translated as X (an unknown amino acid),
according to this note in the docstring:

NOTE - Ambiguous codons like "TAN" or "NNN" could be an amino acid
or a stop codon. These are translated as "X". Any invalid codon
(e.g. "TA?" or "T-A") will throw a TranslationError.
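
To make my expectation concrete, here is a minimal standalone sketch of the
rule as I read it from that note. It is not Biopython's implementation and
does not import Biopython; the names translate_codon and translate are
hypothetical, and I am assuming the standard genetic code and the usual IUPAC
nucleotide ambiguity codes.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide ambiguity codes mapped to the bases they can stand for.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon):
    """Translate one codon; hypothetical helper, not a Biopython function."""
    try:
        choices = [IUPAC_DNA[base] for base in codon.upper()]
    except KeyError:
        # Characters such as '?' or '-' are not IUPAC letters at all.
        raise ValueError("Codon %r is invalid" % codon)
    # Expand the ambiguity codes and collect every possible translation.
    residues = {CODON_TABLE["".join(c)] for c in product(*choices)}
    # One possible outcome: report it. Several outcomes: report X.
    return residues.pop() if len(residues) == 1 else "X"


def translate(seq):
    """Translate whole codons only, ignoring a trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(translate_codon(seq[i:i + 3]) for i in range(0, usable, 3))


print(translate("CCGGGTTNN"))
```

Under those assumptions 'TNN' has several possible translations, so the last
residue comes out as X, which is what I expected from seq.translate() above.
In this sketch a codon like 'CCN' still gives P, because all four expansions
code for proline.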