[Biopython-dev] [Bug 1963] Adding __str__ method to codon tables and translators
bugzilla-daemon at portal.open-bio.org
Sun Feb 26 10:35:32 EST 2006
http://bugzilla.open-bio.org/show_bug.cgi?id=1963
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2006-02-26 10:35 -------
Revised version which:
* Uses the "conventional" nucleotide ordering
* Works for the ambiguous tables
* Shows the table's ID and name(s)
Again, add this method to class CodonTable in Bio/Data/CodonTable.py:

    def __str__(self) :
        """Returns a simple text representation of the codon table"""
        if self.id :
            answer = "Table %i" % self.id
        else :
            answer = "Table ID unknown"
        if self.names :
            answer = answer + " " + ", ".join(filter(None, self.names))

        """
        #Use the conventional ordering for the codon table
        #and only use the main four - even for ambiguous tables
        letters = self.nucleotide_alphabet.letters
        if "T" in letters :
            #DNA
            letters = "TCAG"
        elif "U" in letters :
            #RNA
            letters = "UCAG"
        else :
            print "WARNING - Unexpected alphabet"
        """

        #Use the conventional ordering for the codon table
        #(any other alphabet, e.g. an ambiguous one, is used as is)
        letters = self.nucleotide_alphabet.letters
        if "GATC" == letters :
            #DNA
            letters = "TCAG"
        elif "GAUC" == letters :
            #RNA
            letters = "UCAG"

        #Header row, one nine character column per second base
        answer=answer + "\n\n  |" + "|".join( \
            ["  %s      " % c2 for c2 in letters] \
            ) + "|"
        answer=answer + "\n--+" \
               + "+".join(["---------" for c2 in letters]) + "+--"
        for c1 in letters :
            for c3 in letters :
                line = c1 + " |"
                for c2 in letters :
                    codon = c1+c2+c3
                    line = line + " %s" % codon
                    if codon in self.stop_codons :
                        line = line + " Stop|"
                    else :
                        try :
                            amino = self.forward_table[codon]
                        except KeyError :
                            amino = "?"
                        except TranslationError :
                            amino = "?"
                        if codon in self.start_codons :
                            #Mark start codons with (s)
                            line = line + " %s(s)|" % amino
                        else :
                            line = line + " %s   |" % amino
                line = line + " " + c3
                answer = answer + "\n"+ line
            answer=answer + "\n--+" \
                   + "+".join(["---------" for c2 in letters]) + "+--"
        return answer

Example:
>>> import Bio.Data.CodonTable
>>> print Bio.Data.CodonTable.unambiguous_dna_by_id[11]
Table 11 Bacterial

  |  T      |  C      |  A      |  G      |
--+---------+---------+---------+---------+--
T | TTT F   | TCT S   | TAT Y   | TGT C   | T
T | TTC F   | TCC S   | TAC Y   | TGC C   | C
T | TTA L   | TCA S   | TAA Stop| TGA Stop| A
T | TTG L(s)| TCG S   | TAG Stop| TGG W   | G
--+---------+---------+---------+---------+--
C | CTT L   | CCT P   | CAT H   | CGT R   | T
C | CTC L   | CCC P   | CAC H   | CGC R   | C
C | CTA L   | CCA P   | CAA Q   | CGA R   | A
C | CTG L(s)| CCG P   | CAG Q   | CGG R   | G
--+---------+---------+---------+---------+--
A | ATT I(s)| ACT T   | AAT N   | AGT S   | T
A | ATC I(s)| ACC T   | AAC N   | AGC S   | C
A | ATA I(s)| ACA T   | AAA K   | AGA R   | A
A | ATG M(s)| ACG T   | AAG K   | AGG R   | G
--+---------+---------+---------+---------+--
G | GTT V   | GCT A   | GAT D   | GGT G   | T
G | GTC V   | GCC A   | GAC D   | GGC G   | C
G | GTA V   | GCA A   | GAA E   | GGA G   | A
G | GTG V(s)| GCG A   | GAG E   | GGG G   | G
--+---------+---------+---------+---------+--
>>> print Bio.Data.CodonTable.unambiguous_rna_by_id[1]
Table 1 Standard, SGC0

  |  U      |  C      |  A      |  G      |
--+---------+---------+---------+---------+--
U | UUU F   | UCU S   | UAU Y   | UGU C   | U
U | UUC F   | UCC S   | UAC Y   | UGC C   | C
U | UUA L   | UCA S   | UAA Stop| UGA Stop| A
U | UUG L(s)| UCG S   | UAG Stop| UGG W   | G
--+---------+---------+---------+---------+--
C | CUU L   | CCU P   | CAU H   | CGU R   | U
C | CUC L   | CCC P   | CAC H   | CGC R   | C
C | CUA L   | CCA P   | CAA Q   | CGA R   | A
C | CUG L(s)| CCG P   | CAG Q   | CGG R   | G
--+---------+---------+---------+---------+--
A | AUU I   | ACU T   | AAU N   | AGU S   | U
A | AUC I   | ACC T   | AAC N   | AGC S   | C
A | AUA I   | ACA T   | AAA K   | AGA R   | A
A | AUG M(s)| ACG T   | AAG K   | AGG R   | G
--+---------+---------+---------+---------+--
G | GUU V   | GCU A   | GAU D   | GGU G   | U
G | GUC V   | GCC A   | GAC D   | GGC G   | C
G | GUA V   | GCA A   | GAA E   | GGA G   | A
G | GUG V   | GCG A   | GAG E   | GGG G   | G
--+---------+---------+---------+---------+--
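
The layout of the tables above follows from the loop nesting in __str__: the
first base selects a block of rows, the third base selects the row within that
block, and the second base selects the column. A minimal standalone sketch of
just that ordering (plain Python, no Biopython needed; the name codon_grid is
made up for illustration):

```python
def codon_grid(letters="TCAG"):
    """Return rows of codons in the conventional codon table order."""
    rows = []
    for c1 in letters:        # first base: block of rows
        for c3 in letters:    # third base: row within the block
            # second base: column within the row
            rows.append([c1 + c2 + c3 for c2 in letters])
    return rows

grid = codon_grid()
print(grid[0])    # ['TTT', 'TCT', 'TAT', 'TGT']
print(grid[3])    # ['TTG', 'TCG', 'TAG', 'TGG']
print(len(grid))  # 16
```

Passing "UCAG" instead gives the RNA ordering.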
Question One:
Is this worth adding to Biopython or not?
Question Two:
What is the preferred behaviour for ambiguous tables? Just a 4x4x4 table, as
for the unambiguous tables, or the full 15x15x15 table? I have implemented
both (see the commented-out code).
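
For a sense of scale, here is a rough sketch of how large each layout gets. It
assumes every cell is 10 characters wide (nine dashes plus a separator, as in
the ruler lines of the example output) and uses the 15-letter IUPAC ambiguous
DNA alphabet. The helper name table_size is hypothetical.

```python
def table_size(letters):
    """Return (codon count, data lines, line width) for the text table."""
    n = len(letters)
    codons = n ** 3
    data_lines = n * n        # one line per (first base, third base) pair
    width = 3 + 10 * n + 2    # "T |" + n cells of 10 characters + " T"
    return codons, data_lines, width

print(table_size("TCAG"))             # (64, 16, 45)
print(table_size("GATCRYWSMKHBVDN"))  # (3375, 225, 155)
```

So the full ambiguous table comes to 225 data lines of 155 characters each,
which no longer fits an 80 column terminal.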
Question Three:
Is there a standard Biopython function to convert one-letter amino acid
codes into three-letter names? i.e. like one_to_three from
Bio.PDB.Polypeptide but more general. That function does not cope with
ambiguous codes.
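
A minimal sketch of such a conversion, not taken from Biopython: the function
name one_to_three_ambiguous and the lookup dictionary are made up for
illustration. The three-letter names are the usual IUPAC ones, with Asx, Glx
and Xaa for the ambiguous codes B, Z and X. Mapping any unrecognised letter to
Xaa is an assumption.

```python
# Hypothetical lookup table using the usual IUPAC three-letter names.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    # Ambiguous codes
    "B": "Asx",  # Asp or Asn
    "Z": "Glx",  # Glu or Gln
    "X": "Xaa",  # any or unknown
}

def one_to_three_ambiguous(seq, unknown="Xaa"):
    """Convert a one-letter protein string into a list of three-letter names."""
    # Assumption: letters missing from the table fall back to `unknown`.
    return [THREE_LETTER.get(letter.upper(), unknown) for letter in seq]

print(one_to_three_ambiguous("MBZX"))  # ['Met', 'Asx', 'Glx', 'Xaa']
```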