[BioPython] pickling and assertions in Translate

Andrew Dalke dalke@acm.org
Sat, 13 May 2000 16:19:44 -0600


>    Okay, but then I am also pickling the sequence objects, so that I 
>don't have to retrieve them from the remote connection every time I run 
>the script. However, if I run the script again and use unpickled 
>objects instead of freshly fetched objects,  I get the following 
>(slightly disturbing :) assertion error from Translate:
>
>File "/usr/local/lib/python1.5/site-packages/Bio/Tools/Translate.py", 
>line 33, in translate_to_stop
>    assert seq.alphabet == self.table.nucleotide_alphabet, \
>AssertionError: cannot translate from given alphabet 
>(have IUPACUnambiguousDNA(), need IUPACUnambiguousDNA())
>


That's a design problem I haven't fixed yet.  I'm testing to make sure
you only pass in the expected alphabet, but I haven't defined how ==
works on alphabets, so the default (which compares the object ids) is
used.  That works if there is only a single instance of an alphabet.
Unpickling creates a new instance, which breaks that assumption.
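
To make the failure concrete, here is a minimal sketch in present-day
Python.  The class below is a stand-in I made up for illustration, not
the real Bio alphabet code; the only thing it shares with it is that it
defines no comparison method, so == falls back to object identity.

```python
import pickle


class IUPACUnambiguousDNA:
    """Stand-in alphabet with no comparison method defined."""

    def __repr__(self):
        return "IUPACUnambiguousDNA()"


# The single shared instance that the identity-based test relies on.
unambiguous_dna = IUPACUnambiguousDNA()

# A pickle round trip builds a brand-new instance of the same class.
restored = pickle.loads(pickle.dumps(unambiguous_dna))

# Both objects print identically, which is why the error message reads
# "have IUPACUnambiguousDNA(), need IUPACUnambiguousDNA()".
print(repr(restored), repr(unambiguous_dna))
print(restored is unambiguous_dna)   # identity check
print(restored == unambiguous_dna)   # default == is the identity check
```

Both comparisons come out false even though the two objects are
indistinguishable by class and repr.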

One solution is to create a __cmp__ method for the alphabets.  The
generic version (assuming there is no object data) is:
  class Alphabet:
    ...
    def __cmp__(self, other):
        # 0 means "equal"; two alphabets match when their classes do
        return cmp(self.__class__, other.__class__)

If there is state data, then those classes need to define their own
__cmp__ methods.
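
For readers on Python 3, where __cmp__ and cmp() no longer exist, the
same idea can be sketched with __eq__.  This is an illustration under
my own assumptions: the subclass names are hypothetical, and it still
assumes the alphabets carry no state.

```python
class Alphabet:
    """Base class: two alphabets are equal when their classes match."""

    def __eq__(self, other):
        if not isinstance(other, Alphabet):
            return NotImplemented
        return self.__class__ is other.__class__

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore one
        # that agrees with the equality rule above.
        return hash(self.__class__)


class DNAAlphabet(Alphabet):        # hypothetical subclass
    pass


class ProteinAlphabet(Alphabet):    # hypothetical subclass
    pass


print(DNAAlphabet() == DNAAlphabet())       # separate instances, same class
print(DNAAlphabet() == ProteinAlphabet())   # different classes
```

With this rule a freshly unpickled alphabet compares equal to the
original, because only the class is examined.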

This isn't great because derived classes don't work.  I.e., an unambiguous
alphabet can be used almost anywhere an ambiguous one is used.  So the
real assertion test should be more like:
  assert isinstance(seq.alphabet, Alphabet.NucleotideAlphabet)

But this doesn't work if there is any state data.  I don't think there
is for Alphabets - there was, but I moved the gaps and stop codon
encodings to an Encoding class, rather than make them an Alphabet.
(Currently booted in MS Windows, so don't have access to the code just
now.)  The test for == I believe predates this change.  I believe the
isinstance test is the best solution.
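
A rough sketch of how the isinstance test behaves with derived
classes.  The class hierarchy and the helper name is_translatable are
invented for this example and are not taken from the Bio code.

```python
class NucleotideAlphabet:
    """Hypothetical base class for anything a translator accepts."""


class AmbiguousDNA(NucleotideAlphabet):
    pass


class UnambiguousDNA(AmbiguousDNA):
    # An unambiguous alphabet is usable wherever an ambiguous one is.
    pass


class ProteinAlphabet:
    pass


def is_translatable(alphabet):
    """Return True if the alphabet is any kind of nucleotide alphabet."""
    return isinstance(alphabet, NucleotideAlphabet)


print(is_translatable(UnambiguousDNA()))   # derived class is accepted
print(is_translatable(ProteinAlphabet()))  # unrelated class is rejected
```

Because the check looks only at the class, it gives the same answer
for an unpickled copy as for the original object.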

Another solution, which I don't propose, is to define __getstate__
and __setstate__ to return a singleton object.
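
For the curious, here is one way the singleton idea could look in
present-day Python.  Note that this sketch swaps in a different hook:
it overrides __new__ rather than __getstate__/__setstate__, relying on
the fact that pickle protocol 2 and later recreate objects by calling
cls.__new__(cls).  The class name is hypothetical.

```python
import pickle


class SingletonAlphabet:
    """Hypothetical alphabet that allows at most one instance per class."""

    _instance = None

    def __new__(cls):
        # Look in this class's own namespace so that each subclass
        # gets its own single instance.
        if cls.__dict__.get("_instance") is None:
            cls._instance = super().__new__(cls)
        return cls._instance


original = SingletonAlphabet()

# Unpickling calls __new__, which hands back the existing instance.
restored = pickle.loads(pickle.dumps(original))

print(restored is original)
```

It works, but it hides a fair amount of machinery behind what looks
like an ordinary constructor call.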

Finally, if you just want to test things, run Python with -O, which
ignores assert statements as one of its two optimizations.
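
A quick way to see the effect of -O, sketched with a child interpreter
so that it can be run from any script.  The one-line snippet is just a
failing assert chosen for illustration.

```python
import subprocess
import sys

# A statement that always fails when asserts are enabled.
snippet = "assert False, 'never true'"

# Normal run: the assert fires and the interpreter exits with an error.
normal = subprocess.run(
    [sys.executable, "-c", snippet],
    stderr=subprocess.DEVNULL,
)

# With -O the assert statement is compiled away, so nothing fails.
optimized = subprocess.run(
    [sys.executable, "-O", "-c", snippet],
    stderr=subprocess.DEVNULL,
)

print("without -O:", normal.returncode)
print("with -O:   ", optimized.returncode)
```

This only silences the check; the unpickled alphabet is still a
different object from the one the translator holds.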

                    Andrew