[BioPython] codons and complements

Andrew Dalke dalke@bioreason.com
Tue, 12 Oct 1999 11:52:25 -0600


Thomas.Sicheritz@molbio.uu.se showed an example where having the stop
codon expressed in the protein as *something* is useful:
> And in the complete genome sequencing progress - you still want to
> treat stop codon containing sequences as normal sequences until you
> know if it is sequencing error or an ancient gene etc.

Sure, bring the real world into the discussion! :)

Okay, having "*" or "X" or whatever is useful.  Nothing I've said
precludes using it; I just wanted to have a way to make stronger
statements about the alphabets being used, if possible.


> we had some problems in the beginning because of software not
> accepting stop codons in sequences. So we were forced to reinvent
> several wheels :-(

And thank you for being in this discussion, since it helps me, at
least, understand some of the needs people have.


> I still think there must be space for the user/caller to decide
> about biological meanings and how to treat stop codons in the middle
> of CDS (Coding sequences).

Okay.  There doesn't seem to be a standard way to do this, so
how about keeping the generic method I have (which assumes that
the sequence and the codon table are sufficient), and letting the
user write their own functions to handle the special cases?
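
A minimal sketch of that split, assuming a toy codon table. The names
translate and translate_to_first_stop and the table itself are
hypothetical illustrations, not the code under discussion and not an
existing Biopython API.

```python
# Hypothetical illustration; none of these names come from a real library.
# A deliberately tiny codon table; a real one has 64 entries.
CODON_TABLE = {
    "ATG": "M", "TTT": "F", "GGC": "G", "AAA": "K",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

def translate(seq, table, unknown="X"):
    """Generic translation: needs only the sequence and the codon table."""
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return "".join(table.get(codon, unknown) for codon in codons)

def translate_to_first_stop(seq, table, stop_symbol="*"):
    """One user-written special case: cut the protein at the first stop."""
    protein = translate(seq, table)
    return protein.split(stop_symbol, 1)[0]

dna = "ATGTTTTAAGGCAAA"
print(translate(dna, CODON_TABLE))                # stops kept as "*"
print(translate_to_first_stop(dna, CODON_TABLE))  # truncated at the stop
```

The generic function keeps internal stops as "*" so nothing is lost;
a caller who wants different behavior wraps it rather than changing it.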

The problem is, I can't see any more generic way to do things
that also handles the special cases, and I know that what I have is
what most people want.

As I mentioned in my previous email, it does mean that having
"translate" as a method probably makes some sense.  I'm just
still against it.  (For example, it requires that the SWISSPROT
parser for RF2_ECOLI be able to parse the record and create
an object with a .translate() which understands that the stop codon
really is translated.  Such a parser would be much more complicated.
OTOH, if the knowledge of how the translation is done is stored
elsewhere, then it makes more sense to use a function also available
elsewhere, and not a method of the object.)
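
To make the function-versus-method point concrete, here is a sketch
under stated assumptions.  The Seq class, the overrides argument, and
the sequence data are all invented for illustration; this is not the
real RF2_ECOLI record and not an existing parser or API.

```python
# Hypothetical sketch: every name and all data here are invented.
CODON_TABLE = {"ATG": "M", "CTT": "L", "GGC": "G", "TGA": "*"}

class Seq:
    """A plain sequence object: it holds data and knows nothing about translation."""
    def __init__(self, data):
        self.data = data

def translate(seq, table, overrides=None):
    """Translate seq; overrides maps a codon index to the residue to emit there."""
    overrides = overrides or {}
    residues = []
    for index in range(len(seq.data) // 3):
        codon = seq.data[3 * index:3 * index + 3]
        # An override wins over the table; unknown codons become "X".
        residues.append(overrides.get(index, table.get(codon, "X")))
    return "".join(residues)

# Knowledge about the special case lives outside the sequence object,
# so a parser only has to build a plain Seq.
seq = Seq("ATGCTTTGAGGC")
print(translate(seq, CODON_TABLE))                      # stop shown as "*"
print(translate(seq, CODON_TABLE, overrides={2: "D"}))  # stop codon translated
```

With this layout the parser stays simple, and whoever knows about the
exception passes it to the function explicitly.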


> If I understood Andrew's last post this shouldn't be a problem
> with the proposed sequence model.
> .. Or maybe I am completely lost again ... ?

Nope, you got it right.

						Andrew
						dalke@acm.org