[Biopython-dev] Quality scores (and per-letter-annotation) in a SeqRecord?

Peter biopython at maubp.freeserve.co.uk
Mon Feb 23 18:34:18 UTC 2009


Peter wrote:
>> For the implementation, we could start with a simple dictionary and
>> see if any kind of safety feature should be added later if it seems
>> necessary.  What I had in mind was a dict subclass which takes the
>> sequence length, and by overriding the __setitem__ method checks only
>> python sequences (objects with __len__ and __getitem__) of the
>> appropriate length can be added.

On Mon, Feb 23, 2009 at 1:25 PM, Jose Blanca <jblanca at btc.upv.es> wrote:
> I'm not sure how to implement that.

This is what I had in mind, though I haven't properly tested it yet:

class RestrictedDict(dict):
    """A dictionary which only allows sequences of given length as values."""
    def __init__(self, length) :
        """Create an EMPTY dictionary."""
        dict.__init__(self)
        self._length = int(length)
    def __setitem__(self, key, value) :
        if not hasattr(value,"__len__") or not hasattr(value,"__getitem__") \
        or len(value) != self._length :
            raise TypeError("We only allow python sequences (lists, "
                            "tuples or strings) of length %i." % self._length)
        dict.__setitem__(self, key, value)

x = RestrictedDict(4)
x["test"] = "abcd"
x["test"] = ["a","b",5,None]
x["test"] = (1,2,3,4)
try :
    x["test"] = "abcde" #wrong length
    assert False
except TypeError :
    pass
try :
    x["test"] = 10 #not a sequence
    assert False
except TypeError :
    pass
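
One caveat with this approach: in CPython, dict.update() and
dict.setdefault() do not go through an overridden __setitem__, so values
added that way would skip the length check. A minimal sketch of one way
to close the update() gap follows. The update() override is my own
addition for illustration, not part of the class above, and setdefault()
would need similar treatment:

```python
class RestrictedDict(dict):
    """Same idea as above, repeated so this sketch is self-contained."""
    def __init__(self, length):
        dict.__init__(self)
        self._length = int(length)

    def __setitem__(self, key, value):
        if not hasattr(value, "__len__") or not hasattr(value, "__getitem__") \
        or len(value) != self._length:
            raise TypeError("We only allow python sequences (lists, "
                            "tuples or strings) of length %i." % self._length)
        dict.__setitem__(self, key, value)

    def update(self, *args, **kwargs):
        # Hypothetical addition: route every item through the checked
        # __setitem__ rather than relying on dict.update().
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


x = RestrictedDict(4)
x.update({"test": "abcd"})       # accepted, length 4
try:
    x.update({"bad": "abcde"})   # wrong length, rejected by the check
    assert False
except TypeError:
    pass
```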


> What would you think about creating a new
> class based on dict but with an extra property, parent? parent would be a
> reference to the SeqRecord. This new class would check the length of its
> parent before adding the letter_annotation. I'm just asking because I'm
> curious about the best way to implement it.

This could work, and would also mean the length of the sequence would
get updated if the parent SeqRecord's seq property was changed.  On
the other hand, this kind of thing could cause trouble for automatic
garbage collection (because of the circular references between the
objects).  This may not be a real problem, but it's something I would
worry about.
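
If we did go down this route, one way to sidestep the circular reference
would be to have the dictionary hold only a weak reference to its parent.
A rough, untested-in-Biopython sketch follows. Record is a hypothetical
stand-in for SeqRecord, and ParentCheckedDict is a made-up name:

```python
import weakref


class ParentCheckedDict(dict):
    """Dict whose values must match the current length of the parent's seq."""
    def __init__(self, parent):
        dict.__init__(self)
        # Weak reference only: the dict does not keep the parent alive,
        # so no strong reference cycle is created.
        self._parent = weakref.ref(parent)

    def __setitem__(self, key, value):
        parent = self._parent()
        if parent is None:
            raise ValueError("The parent record no longer exists.")
        length = len(parent.seq)
        if not hasattr(value, "__len__") or not hasattr(value, "__getitem__") \
        or len(value) != length:
            raise TypeError("We only allow python sequences (lists, "
                            "tuples or strings) of length %i." % length)
        dict.__setitem__(self, key, value)


class Record(object):
    """Hypothetical stand-in for SeqRecord, holding only a sequence."""
    def __init__(self, seq):
        self.seq = seq
        self.letter_annotations = ParentCheckedDict(self)


record = Record("ACGT")
record.letter_annotations["quality"] = [30, 32, 28, 40]
record.seq = "ACGTA"  # later additions are checked against the new length
try:
    record.letter_annotations["other"] = [1, 2, 3, 4]
    assert False
except TypeError:
    pass
```

Note that in this sketch changing seq does not re-validate the entries
already stored, so the old "quality" list would be left with the wrong
length; that would need handling separately.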

Peter



